$4.95 ($5.95 CANADA) 


A Miller Freeman Publication 









































































Dr. Dobb's Journal

SOFTWARE TOOLS FOR THE PROFESSIONAL PROGRAMMER





FEATURES 


OBJECT PERSISTENCE: BEYOND SERIALIZATION 

by Timo Salo, Justin Hill, Scott Rich, Chuck Bridgham, and Daniel Berg 

Our authors describe techniques and frameworks necessary to successfully implement scalable 
object persistence for complex database systems. Much of the technology they examine has been 
incorporated in development tools ranging from VisualAge for Java, to EJB tools for WebSphere. 


JAVA PROXIES FOR DATABASE OBJECTS 

by Paul Lipton 

Java proxy technology lets you define database object schema using the database ODL. To illustrate 
how such a technology might be implemented, Paul provides examples based on the Jasmine 
object-oriented database. 


VBSCRIPT AND SQL CALENDARS 

by John Donovan Lambert 

John presents the VBScripts he uses for inputting SQL results into a web calendar, and discusses 
how you can port these scripts to Java, Perl, Cold Fusion, or whatever language you prefer. 


THE CVS DATA FORMAT 

by Cesar A. Gonzalez Perez 

The CVS data format stores cartographic data for a specific geographic area into a single file. Cesar 
examines the format, then presents a tool for converting CVS files into DXF format. 


AGENT ITINERARIES 

by Russell P. Lentini, Goutham P. Rao, and Jon N. Thies 

Instead of examining itineraries in the traditional way as a list of tasks to be performed by agents,
our authors treat itineraries as a metaprogram, a way of programming an agent and inadvertently
its goal. To illustrate, they'll present an itinerary that performs a database query.


JAVA AND DIGITAL IMAGES 

by David H. Martin and Johnny Martin 

Capturing, storing, and retrieving images is an often-overlooked feature that many applications 
could benefit from. David and Johnny describe “Grabber for Java,” an API that encapsulates the 
functionality necessary for video capture. 


EMBEDDED SYSTEMS


THE SPARK REAL-TIME KERNEL

by Anatoly Kotlarsky

SPARK, short for “Small Portable Adjustable Real-time Kernel,” is a royalty-free, fast, tiny,
portable real-time kernel. Anatoly describes how he used it to build a video bar-code


INTERNET PROGRAMMING


AUTOMATED TESTING FOR WEB APPLICATIONS

The technique for automated web-user-interface testing presented here is based on HTML,
JavaScript, and CGI, and implemented for Netscape Communicator 4.04 and Apache 1.2.


THE VERSION CONTROL PROCESS

by Aspi Havewala

Source-code version control is a set of working rules for code sharing that lets
developers modify files in an exclusive way. As such, it is one of the most
important, yet least understood, areas of software development.
EDITORIAL


Clear Cutting the Concept

Hi, I’m “jerickson.com, an Internet company,” and, like other Internet Wunderkinder, I’m
going public. Sure, I’ve never made a profit. So what. That doesn’t stop me from pocketing
sucker money, and, if Wall Street is a gauge, big bucks are there for the taking.

Putting all seriousness aside, for me the height of the Internet stock lunacy was Prodigy’s initial
public offering. Here’s an 11-year-old company that’s gone through at least $4 billion in funding, and
never made a penny. But within 24 hours of its IPO, Prodigy’s stock had doubled from $15 to $30 a
share. What a concept. Then there’s Amazon.com, the online bookseller that encourages authors to
review their own books and conduct interviews with themselves. According to Business Week
(December 14, 1998), Amazon.com needs a growth rate of almost 60 percent per year for the next 10
years to justify its December $214 per share price. For this to happen, Amazon.com has to have
annual sales of $63 billion. However, as Business Week’s Jeffrey Laderman noted, U.S. retail book
sales in 1997 only totaled $11.8 billion, with relatively flat growth projections. By way of comparison,
Microsoft’s annual growth over the last 13 years has averaged 43 percent.

To increase revenues, Amazon.com has branched out into music and video, neither of which is
a market much bigger than books. And to further justify its stock price, money-losing
Amazon.com has invested in money-losing Drugstore.com. Selling toothpaste and shampoo over
the Internet. Now there’s a concept.

Speaking of which, if you are not a subscriber to DDJ, stop reading now! Oh, I’m just kidding,
but Sun Microsystems isn’t, at least when it comes to its Java 2 source code (see
http://www.sun.com/software/communitysource/java2/). According to Sun’s Java 2 license, licensees
can use and modify the source code for commercial software development without charge. Good.
Licensees can change the source code without returning the innovation to Sun. Okay. Finally,
licensees can share source code and modifications only with other licensees. Huh? Does this mean
that you can modify the source code, but can’t post it on your web site for peer review? Sounds like
it, unless you add a warning like “Achtung! Halt! Verboten!” Ditto if you want to write a book or
magazine article, present a conference paper, or otherwise share hard-won information with fellow
programmers. Although counter to the accepted concept of “open source,” Sun believes this approach
does, in fact, incorporate “the best of the Open Source model.”

Upon reading the Sun press release (which, as a member of the press, I was allowed to do), I
asked a spokesperson what the policy is regarding books, articles, and papers read by
nonlicensees. She said they’d be back in touch real soon. Alas, I’m still waiting.

As unclear as Sun is on the concept, it doesn’t hold a candle to 3Com/Palm Computing’s
source-code license for the Palm OS (http://www.palm.com/devzone/rom3/srclicense.pdf). For
one thing, all suggestions for improvement belong to Palm. Also, the license only gives you the
right to “reproduce and display” copies of the source code at the “[street] address at which
Licensee has registered.” You can’t “copy, duplicate, or otherwise reproduce” portions of the
source code and send snippets of it to another person via e-mail, let alone put it on a laptop and
study it on the train. In short, Palm’s license treats the source code as read-only documentation.
Now, this isn’t all bad; a lot of jobs would be easier if, say, Microsoft did the same with Windows 98.
However, Palm’s implying that its code is “open source” is like Sam Goldwyn saying a verbal
contract isn’t worth the paper it’s written on. Clearly not clear on the concept.

Internet companies aside, an up-and-coming concept is home networks. What with the
availability of cheap PCs, 15 to 20 million U.S. homes now have two or more computers,
according to Dataquest. Consequently, multiple-PC households want to share files, assorted
peripherals, and (eventually) Internet access. Currently, says IDC, 1.9 million U.S. homes have
networks, growing to 12 million homes in a few years. Today’s $200 million in home-networking
sales are expected to be $1 billion in five years.

Several companies, mostly on the cabling and hardware side, are gearing up for home
networking: 3Com, ShareWave, Philips Electronics, Epigram, Proxim, ActionTec, and Tut
Systems, among others. Some are providing wireless solutions, others are using phone and
electrical lines, still others are using standard network cables. And, surprise, there’s even an
industry trade group, the Home Phoneline Networking Alliance (http://www.homepna.org/).

It’s nice to share printers and files, but the problem is the paucity of important stuff: network
games. So, with all the focus on Internet-aware development, take a look at opportunities in the
network-enabled home software market. It may be a concept whose time has come. In the
meantime, keep your eyes open for my IPO, which should pop up any day now at Charles
Schwab’s web site at http://w... oops, the site just went down again; third time this month by my
count. Looks like I’m having better luck at becoming an Internet company than Charles Schwab.


Jonathan Erickson
editor-in-chief
jerickson@ddj.com

Java Provider Update

Dear DDJ,
The sample code included with my article, “The Java Provider Architecture” (DDJ, March 1999), was
written using the Early Access release of the Java Cryptography Extension 1.2. I’d like to let
readers know that, since the code was developed, a newer release of the JCE (Release Candidate 1)
has been made available. Some of the minor changes and improvements made by RC1, most notably the
change of the argument type of the constructor of class PBEKeySpec from String to char[] and the
appearance of the additional methods engineInit() and engineGetParameters() in class CipherSpi,
cause compile errors if you attempt to compile my code. To rectify this, simply add the two new
methods to Enigma.java and make the appropriate change to the argument of the PBEKeySpec
constructor in EnigmaTest.java.

Paul Tremblett
paul_tremblett@beechwood.com
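
The constructor change Paul describes can be sketched as follows. This is only an illustration under stated assumptions: it uses the javax.crypto.spec.PBEKeySpec class as it ships with current JDKs rather than the 1999 JCE 1.2 release candidate, and the class name PbeKeySpecSketch, the helper specFor(), and the sample passphrase are hypothetical and do not come from the Enigma sample code.

```java
import java.util.Arrays;
import javax.crypto.spec.PBEKeySpec;

public class PbeKeySpecSketch {

    // Hypothetical helper: builds a key spec from a String password by
    // converting it to the char[] that the constructor now expects.
    static PBEKeySpec specFor(String password) {
        char[] chars = password.toCharArray();
        try {
            // PBEKeySpec copies the array, so the local copy can be wiped afterward.
            return new PBEKeySpec(chars);
        } finally {
            Arrays.fill(chars, '\0');
        }
    }

    public static void main(String[] args) {
        PBEKeySpec spec = specFor("hypothetical passphrase");
        System.out.println("password length: " + spec.getPassword().length);
        spec.clearPassword(); // wipe the internal copy when finished
    }
}
```

A char[] can be overwritten once it is no longer needed, which a String cannot, and that is presumably why the argument type was changed.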


Better Late than Never

Dear DDJ,
I am writing this, having just purchased the December 1998 issue of DDJ. As you can see, we’re
already into 1999, but that’s not really my fault; that’s when the magazines get here.

The name D-Flat was originally stated as a relationship to C (the language you wrote it in,
obviously), and a musical alternative to C sharp. “D flat is a difficult key to play in” was what
I believe you wrote in at least one of your columns. I suppose you should call it a pun... I
believe I still have most if not all of the original columns. I also managed to write some useful
applications with it (all of which have now bit the dust thanks to Windoze).

So, do I win the invaluable (or was that valueless...) prize? (despite being the 1,527th
respondent).

A.P. Madden
Australia
amadden@pcug.org.au

Al replies: Andrew, you might win the prize for the entry from the most distant location.
Although, in this Internet era, that’s a dubious distinction. I don’t think I ever said that D♭ is
a difficult key, but the rest of your memory is accurate.


CD Authoring

Dear DDJ,
In the article, “A Java Applet Search Engine” (DDJ, February 1999), Tim Kientzle complained that
he wasn’t able to find any CD mastering package capable of generating hybrid
ISO/RockRidge/Joliet/HFS (DOS/Unix/Windows/Macintosh) CD-ROMs. I am happy to say that his wish has
been granted: mkhybrid, written by James Pearson (j.pearson@ge.ucl.ac.uk), is the program Tim was
looking for. It is freely available in source-code form at
http://www.ge.ucl.ac.uk/~jcpearso/mkhybrid.html. Several other free CD mastering tools (including
CD burning software, digital audio extraction tools, GUIs, and much more) are listed at the CD
building project for UNIX
(http://www.fokus.gmd.de/research/cc/glone/employees/joerg.schilling/private/cdb.html).
Taken together, these tools make for a nice CD mastering environment indeed.

Serguei Patchkovskii
patchkov@ucalgary.ca


Full-Text Searching

Dear DDJ,
Thanks to Tim Kientzle for his article “Full-Text Searching in Perl” (DDJ, January 1999). DDJ
readers might be interested in being able to search the generated databases from PHP3
(http://www.php.net/), the popular web-server language. They can find scripts that I wrote to
make this possible at http://www.heddley.com/edd/php/search.html.

Unfortunately, PHP doesn’t guarantee Berkeley DB support, so I created a GDBM version as well,
which doesn’t work as fast, for those whose PHP implementations didn’t do DB (although there is
some information on the page for how to recompile your PHP for DB support).

There are also a couple of other alterations for PHP in there that get around the fact that PHP
doesn’t support NUL in database keys, and neither does it have the pack/unpack functions from
Perl.

Edd Dumbill
edd@heddley.com


Dear DDJ,
I enjoyed reading about Berkeley DB in “Full-Text Searching in Perl,” by Tim Kientzle (DDJ,
January 1999), but would like to respond to a few of the points Tim raised.

First, the performance and reliability problems that Tim reported were caused by using a
several-year-old academic release of Berkeley DB. That version of the software has never been
commercially supported and is known to have serious flaws. The current Berkeley DB release is
API compatible with Tim’s use, and suffers from none of the problems he listed.

Second, we have worked hard to make the current release easy to download and install. Regardless
of the Perl module being used, we strongly encourage Perl developers to download and use the
latest code.

Finally, while we have kept our interfaces intuitive and easy to use, Berkeley DB is not a
“simple database.” It provides the features that developers expect from high-end commercial
offerings, including full transactional support, disaster recovery, hot backups, and scalability
in both number of users and volume of data. Berkeley DB supports 100 GB databases in major
commercial products just as easily as it does everyone’s favorite Perl applications.

Michael A. Olson
Sleepycat Software
mao@sleepycat.com

Tim responds: Thanks for your note, Michael. Unfortunately, the current Berkeley DB 2.X is not
file compatible with the Berkeley DB 1.85 I used. Consequently, neither will it work with the
Java applet I presented in “A Java Applet Search Engine” (DDJ, February 1999). Interestingly,
most of the problems I was experiencing disappeared when, thanks to a suggestion by Paul
Marquess, I reduced the memory requirements of my Perl code.


Taming C++

Dear DDJ,
I enjoyed reading Al Stevens’ “C Programming” column in the December 1998 issue of DDJ. I have
been reading DDJ since another Al (Al Williams) was talking about “Roll Your Own DOS Extender”
back in October 1990. Yes, I did give it a roll and enjoyed trying to use every security feature
built into the 386 and finding out firsthand just why no one actually uses every security
feature. Yes, like Al, I find research is important. I like to explore ideas and go deeply into
software topics to find design inspiration.

As you might have guessed, I am a programmer and I am currently working for an organization in
the financial services sector. There is a diverse mix of platforms and technologies, so I have
plenty of variety, but most new development work is done in C and C++, which is also where I have
the most experience. I have worked on GUI application development for both Windows and OS/2 and
find it very enjoyable. I think this is because it is challenging and the results are visible,
which is often not the case in programming.

What interested me particularly about Al’s column is that he is planning to embark on D-Flat
2000, as I am currently working on a GUI application framework as a “research” project in my
spare time. I’m calling my project “Classic” as it will be the culmination of all I have learned
about GUI application programming, hence I hope worth preserving, plus the name contains “class,”
which is appropriate for an object-oriented design. Yes, I remember Al saying that D-Flat is the
same as C-Sharp on the musical scale, hence a disguised way to get “C” in the name.

What struck a chord with me was Al’s comment that “Recent forays into templates have convinced
me that...templates are an ideal medium with which to express abstractions.” I feel that it is
essential to express abstractions clearly in one’s code to combat the inevitable complexity that
comes with developing significant applications. I agree that templates lend themselves readily
to expression of abstractions. I have been particularly inspired in this regard by Jiri Soukup
in his book Taming C++: Pattern Classes and Persistence for Large Projects (Addison-Wesley,
1994). He suggests ways to implement pattern classes, including using templates. I have taken
some of these templates and developed them for use as a small foundation class library.

Taming C++ sums up my aspiration for the effect of using pattern classes in my application
framework. I can clearly express the roles classes play and their relationships to each other.
Up to now, such information always seems to disappear as implementation in code proceeds.

Currently Classic is a set of foundation classes and a tiny application that pops up a dialog
box with two buttons and some static text. Not very exciting, but I like using the pattern class
templates, and I think their potential is exciting. For example, I’m using a Composite class as
a container for my menu bar, and it stores the menu items in a hierarchical structure that
reflects the hierarchical menu structure. This is achieved in an unobtrusive way. I don’t have
to create any special functions or add any variables; the pattern behavior is acquired by simple
inheritance.

Here is how I define class menu_component, from which I define menu_item and popup_menu:

class menu_component : public class_tag<menu_component>

Here is how I define the container to hold all menu components:

Composite<popup_menu, menu_component> menu_composite;

My adapted pattern classes are easier to use than the originals, but do suffer from some loss of
performance. This is because I create the required infrastructure dynamically, while this is
done in Taming C++ statically, at compile time. However, I don’t think the loss of performance
will be significant, especially because I have optimized iteration, which would be the most
common method of object navigation. Furthermore, I will be able to create a “pattern class
development tool” for class modeling and design.

Anyway, I encourage Al to proceed with D-Flat 2000. I would really like to see what
materializes.

Andrew Bowley
bowley@enternet.com.au
Tip:


' Copy the selected fields from the source table into a table in another database.
Dim db As Database
Set db = Workspaces(0).OpenDatabase(strSourcePathAndFile)
db.Execute "SELECT tbl.field INTO " & _
    "[dbms type;DATABASE=DestinationPath].[FileName] " & _
    "FROM [SourceTable]"


That's all there is to it, but it's a bit cryptic, so here's what each thing
means: tbl.field is the field that you want to put in the destination
database. It can be a single field, multiple fields, or "*" for all fields;
dbms type is the ISAM database type; DestinationPath is the path of the
destination database; FileName is the name of the destination
database; and SourceTable is the source table.
National Engineers Week at UC Berkeley

Part academic symposium, part sales pitch, the University of California Berkeley’s Electrical
Engineering and Computer Science (EECS) department’s annual conference presented some of the
department’s recent research efforts. Held in conjunction with National Engineers Week, the
conference featured three talks that highlighted some of the department’s more interesting
research: Richard Newton on “Software on Silicon,” Bob Brodersen on “Goals of the New Berkeley
Wireless Research Center,” and Dave Patterson on “Post-PC Computer Architecture.”

Newton focused on the growing productivity gap in integrated circuit design, a major concern at
Berkeley’s Gigascale Silicon Research Center. The problem is that as Moore’s Law continues to
hold, the potential capability of chips is growing faster than our ability to design such complex
chips. Two possible solutions to this dilemma are to improve existing chip design software and
to use programmable, general-purpose microcontrollers rather than application-oriented chips.
Newton seems to favor the latter solution, noting that it shifted the complexity from the
hardware design to the software. One drawback to this approach, however, is the higher power
requirements for programmable chips. According to Newton, there is a four-orders-of-magnitude
difference in performance per unit of power [...] purpose chips.

Brodersen took the opposite position, saying that with better software design tools, there was
no reason why we couldn’t overcome the productivity gap in IC design. The focus of Brodersen’s
talk, however, was not on the difficulties of chip design, but on the formation of the new
Berkeley Wireless Research Center. Part of the motivation behind the center came from some of
the collaboration difficulties researchers had with the development of Infopad, a wireless,
consumer information prototype that was developed jointly by university and industry
researchers. One unique aspect of the Center is that all of its work is in the public domain.

Finally, David Patterson discussed IStore, a highly reliable, highly scalable storage system
designed to act as the infrastructure for next-generation portable computing devices. Based on
his experiences with RAID and feedback from companies, Patterson realized that several of his
previous design assumptions (such as mechanical disks being the most unreliable part of the
storage system, and companies considering cost the highest priority) were incorrect. One of
IStore’s many new features is the use of multiple free versions of UNIX, the assumption being
that since all of the free versions of UNIX run Linux binaries, if one particular version has a
kernel bug that causes it to crash, the other versions will not have that same bug and will
continue to run.

-- Eugene Eric Kim


Recycle That PC

IBM has released (what it claims to be) the world’s first desktop PC made from 100 percent
recycled plastic resin for all major plastic parts. Contrary to common expectations, the IBM
IntelliStation E Pro, which contains 3.5 pounds of plastic, was converted from a prime resin to
recycled plastic at no extra cost. In fact, some of the recycled parts actually cost 20 percent
less to manufacture. According to an EPA study, approximately 10 percent of the weight content
of municipal landfills is plastic waste. For its part, the IntelliStation is built around a
450-/500-MHz Pentium III microprocessor and IBM’s Fire GL1 3D graphic acceleration subsystem.

-- Jonathan Erickson


Patent Suits

Everything that Microsoft does these days seems to result in public outcry. It’s not really
surprising: People like to root for the underdog, and Microsoft makes Goliath look like a
chihuahua. On the flip side, attacking Microsoft is an easy way to deflect attention from
yourself and win public support, and on occasion, Microsoft is unfairly victimized.

Five years ago, Michael Doyle of Eolas Technologies received U.S. Patent 5,838,906, entitled,
“Distributed hypermedia method for automatically invoking external application providing
interaction and display of embedded objects within a hypermedia document.” In other words, Doyle
was claiming (and was awarded) a patent on Java applets, browser plug-ins, and ActiveX controls.

Predictably, many people expressed outrage, including DDJ editor-in-chief, Jon Erickson, who
mentioned the patent in his November 1995 editorial. Doyle responded in a letter to DDJ:

We are not asking browser companies to pay royalties for developing browsers that can run
applets. Rather, we are only requiring that they adhere to a standard “Web-API” that will be
defined by a consortium of Eolas licensees. This will accelerate the rapid pace of interactive
application development on the Web, not hinder it.

Regardless of how misguided Doyle’s actions were, at least he seemed to have good intentions.

In February 1999, Eolas quietly announced that it was suing Microsoft for patent infringement,
asking for unspecified damages and an injunction forcing Microsoft to cease manufacture of
Internet Explorer. For legal reasons, Doyle refused comment on the suit, but it’s a safe bet
that Eolas is not simply “encouraging” Microsoft to conform to Eolas’s Weblet API.

-- Eugene Eric Kim


Open Service Gateway Spec Planned

The yet-to-be released Open Service Gateway Specification is being touted as the industry’s
first open interface for connecting consumer and small business appliances with Internet
services. Backed by a consortium that includes Alcatel, Cable & Wireless, Electricité de France,
Enron Communications, Ericsson, IBM, Lucent, Motorola, Network Computer, Nortel Networks, Oracle,
Philips, Sun, Sybase, and Toshiba, the spec’s Java-based environment will provide a common
foundation for ISPs, network operators, and equipment manufacturers to deliver a wide range of
Internet services to [...] enable the consolidation and management of voice, data, and
multimedia communications to and from home. The specification will also be designed to provide
secure wireless or wired links between high-value home services (such as security, energy
management, emergency healthcare, and electronic commerce services) and the computer systems of
external computer networks and ISPs. The specification, which is due for release in mid-1999,
will consist of application framework and resource management, client APIs for thin and fat
WANs, device APIs for LANs, security and integrity APIs, and data management APIs for database
integration and administration. For more information, see http://www.osgi.org/.

-- Jonathan Erickson
Most commercial high-volume databases are based on either
the relational or service paradigm (that is, databases
encapsulated within transaction processing monitors).
Persisting objects in these nonobject-oriented databases 
is a major challenge when building large-scale applications. 
On a small scale, object persistence is easy to solve. Seri- 
alization, for example, has been presented as a method for 
providing simple object persistence. However, scaling up in- 
troduces a new set of requirements. Many enterprise object 
systems involve object models with complex inheritance hi- 
erarchies and large numbers of object relationships. The run- 
time configuration often includes multiuser databases that can 
be both relational and nonrelational. The object model and 
database model are often designed by different groups of people,
therefore requiring a loose coupling between the models.
The design of a scalable object persistence framework
must adequately address issues related to performance with 
complex object models, support for complex object transac- 
tions, transformations from object inheritance structures and 
associations to native database structures, translating object 
queries to native database queries, and accessing objects across 
multiple database paradigms. 

There are several standards and specifications related to ob- 
ject databases and object persistence, including the Object Man- 
agement Group (OMG) Standard, Object Database Manage- 
ment Group (ODMG) Standard, and Enterprise JavaBeans (EJB) 
Specification. However, none of these specifications address 
the actual implementation of a persistence engine. At best they 
describe interfaces and high-level components that form the 
API of the system. 

In this article, we'll describe techniques and frameworks re- 
quired to successfully implement scalable object persistence 
for complex systems. We'll address topics such as required
metainformation, read-ahead and caching, queries, object
associations, and concurrent and nested transactions. We have
pioneered these techniques for almost 10 years in many
large-scale projects. Various aspects of the technology we describe
have been incorporated in IBM development tools, including
VisualAge for Java (Persistence Builder), VisualAge for Smalltalk
(ObjectExtender), and EJB development tools for WebSphere.

Figure 1: High-level architecture for a persistence framework.


General Architectures 

Persistence frameworks typically consist of two high-level 
components: the development-time toolkit and the run-time 
persistence engine. Figure 1 is an example of high-level archi- 
tecture for a persistence framework. 

The development toolkit usually includes tools for collecting 
metainformation about the object model and database, and tools 
for generating business object classes and database queries. 

There are two approaches for implementing the run-time 
engine. One approach is to have the metainformation avail- 
able at run time, and generate the queries for retrieving ob- 
jects on-the-fly as the application traverses various object re- 
lationships. This approach makes it possible to build dynamic, 
flexible applications that have no navigation restrictions with- 
in the object model. However, the amount of memory used 
by the metainformation and run-time query generation usu- 
ally results in poorer performance. Another approach is to 
generate the queries at development time. Little explicit metain- 
formation is needed at run time with this approach. Execu- 
tion of the generated queries is faster, because run-time in- 
ferencing is not needed and the queries can often be optimized 
for the database. The drawback is that the object model traversal
paths are fixed. If more paths are needed, more queries
need to be generated and compiled.

Figure 2: Relationships between various metamodels.


Metainformation 

Metainformation for an object persistence framework includes 
information about the application’s object model, the target 
database’s data model, and the queries needed to service the 
application. As Figure 2 shows, the information is often grouped 
into the following models: 


• The data model for describing the relevant subset of the
database schema.
• The persistent object model for describing the persistent
components of the business domain model.
• The mapping model for describing the mapping between the
object model and the data model.


How much detail is captured and whether the metainforma- 
tion is partitioned in one large model or various separate sub- 
models depends on issues of flexibility, efficiency, and expres- 
siveness. Therefore, there is no single correct way to package 
the information, but all the following must be captured in some 
form somewhere in the framework. 

The data model represents the logical view of the database. It 
is a subset of the tables, views, and columns in the database schema 
that are relevant to object systems. This includes information on 
entity qualifier names, logical and physical names of entities, col- 
umn datatypes, and conversions from database types to object lan- 
guage types. Further refinements could include information on 
database column functions such as sums and averages. 

The data model can be augmented with information that is 
not explicitly kept in the database schema. For instance, the 
relationships implicitly defined by the foreign-key references 
in the schema can be modeled as first-class connection objects
in the data model. Enhancing the data model with connections
makes the mapping of object associations to database
relationships a significantly simpler task. Figure 3 presents an
enhanced data model (the structural data model).

Figure 3: A structural data model.

It is not absolutely necessary to have a separate data mod- 
el. However, without such a model, much of this information 
must be captured in the mapping model, thus overloading its 
behavior and state. 

The persistent object model is a subset of the application’s 
object model. It represents only the portion of the business ob- 
ject model that requires persistence behavior. It can be a sub- 
set of classes within the complete business object model and a 
subset of the instance variables within a single class. The al- 
lowed types for the attributes can also be captured for valida- 
tion purposes. 

Besides modeling the simple attributes, the associations be- 
tween the persistent objects can also be modeled. This makes the 
object model independent from the mapping model, allowing a 
clear mapping between the foreign-key relationships in the data 
model and the object associations in the persistent object model. 

The definition for the object identifiers can be captured in the 
object model rather than in the mapping model, again allowing 
simple mapping between the primary key column(s) in the data 
model and object identifier in the object model. 

The persistent object model is optional, and much of the in- 
formation that it provides can be held in the mapping model. 
However, without the object model (as well as without the data 
model) there is a risk of overloading the behavior and state of 
the mapping model. 

The most minimal system that would be of any interest re- 
quires at least a model of mapping between the object struc- 
ture on one side and the target database structure on the 
other. The mapping model contains the essential instructions
that tell the system where the data retrieved from the database
is to be placed in the objects. The mapping model must define
which object class corresponds to which table and which
object attributes correspond to which columns. Refinements
could include mapping one object to multiple tables and one
instance variable to multiple columns, conversions of column
data from primitive types to higher-level object types, and
defining which columns act as database conflict-detection
predicates. Figure 4 shows examples of class-to-table mapping
schemes.

Figure 4: Various class-to-table mapping schemes.

If associations are to be supported transparently, then the map- 
ping must also define which foreign-key relationship corresponds 
to which object association in the object model. Figures 5 and 6 
illustrate various relationship-mapping schemes. 

Finally, if inheritance is supported then the mapping model 
should capture all such information. This would include the type 
of inheritance employed in the database, type discriminator val- 
ues for choosing the appropriate class, and/or foreign-key re- 
lationships between tables. Figure 7 shows examples of inheri- 
tance mapping schemes. 
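
As a rough illustration of the mapping model described above, the following Java sketch records which class corresponds to which table, which attributes correspond to which columns, and which type discriminator value selects a class when several classes share one table. Everything in it is an assumption made for illustration: the names ClassMap and MappingModel, the EMPLOYEE table, and the EMP_TYPE column do not come from any of the tools mentioned in this article.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical mapping metadata for one persistent class.
class ClassMap {
    final String className;
    final String tableName;
    final String discriminatorColumn; // null when no inheritance is mapped
    final String discriminatorValue;
    private final Map<String, String> attributeToColumn = new LinkedHashMap<>();

    ClassMap(String className, String tableName,
             String discriminatorColumn, String discriminatorValue) {
        this.className = className;
        this.tableName = tableName;
        this.discriminatorColumn = discriminatorColumn;
        this.discriminatorValue = discriminatorValue;
    }

    // Records that an object attribute is stored in the given column.
    ClassMap map(String attribute, String column) {
        attributeToColumn.put(attribute, column);
        return this;
    }

    String columnFor(String attribute) {
        return attributeToColumn.get(attribute);
    }

    // Builds an "all instances" select over the mapped columns.
    String selectAllSql() {
        String sql = "SELECT " + String.join(", ", attributeToColumn.values())
                + " FROM " + tableName;
        if (discriminatorColumn != null) {
            sql += " WHERE " + discriminatorColumn + " = '" + discriminatorValue + "'";
        }
        return sql;
    }
}

// Hypothetical mapping model: looks up class maps by class name or by row type.
class MappingModel {
    private final Map<String, ClassMap> byClass = new LinkedHashMap<>();

    void add(ClassMap cm) { byClass.put(cm.className, cm); }

    ClassMap forClass(String className) { return byClass.get(className); }

    // Chooses the class for a row of the given table from its discriminator value.
    ClassMap forRow(String tableName, String discriminatorValue) {
        for (ClassMap cm : byClass.values()) {
            if (cm.tableName.equals(tableName)
                    && discriminatorValue.equals(cm.discriminatorValue)) {
                return cm;
            }
        }
        return null;
    }
}
```

A fuller model would also describe multi-table mappings, type conversions, and the foreign-key relationships behind each association; the sketch covers only the minimum the text calls essential.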


Cache 
Various read-ahead and caching strategies can improve a persistence
framework's efficiency and flexibility. Without read-ahead
and caching capabilities, the application is always starved
for data, parsimoniously reading from the database as associa- 
tions in the persistent object model are traversed and bringing 
back data only one level at a time. With an object model that 
has many relationships, this can cause a large number of ex- 
pensive database roundtrips. 

A read-ahead scheme lets the application minimize the number
of database roundtrips by retrieving large object composition
trees within one query. Read-ahead involves instantiating the
requested objects and caching the data for their related objects,
thereby making sure that the data is present for the objects that
are most likely needed next by the application. How far ahead
objects are read is determined by application requirements.
Flexibility is gained as the queries can be tuned without affecting
the structure or workflow of the application.

Figure 5: 1:1 association mapping schemes.

Reading objects ahead often results in too much data. There- 
fore, it is desirable to keep the data in binary format to delay 
or avoid the performance cost of instantiating unused objects. 
Instantiation of persistent objects is then performed in two 
stages: First, the data is brought into the cache, then the ob- 
jects are instantiated from the cache upon demand. Leaving 
the data in a form that is smaller than a fully instantiated ob- 
ject saves space as well. 

The key to implementing the read-ahead feature is to extend 
the caching scheme to include the relationship semantics of the 
underlying database. Database queries have fixed access paths 
that may differ from the object model navigation order. Therefore,
the data in the cache must be organized in a fashion that
allows dynamically composing any access paths defined in the
database. In the case of relational databases, this means that 
the foreign-key references are extracted from the result set and 
maintained in a structured data cache. Figure 8 shows a struc- 
tured data cache. 
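
One way to picture such a structured data cache is the minimal Java sketch below. It keeps rows in their raw, uninstantiated form and indexes them both by primary key and by foreign-key value, so that a relationship can be followed in either direction regardless of the order in which the query delivered the rows. The class name DataCache and its methods are hypothetical, and rows are simplified to maps from column name to value.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical structured data cache holding raw rows rather than objects.
class DataCache {
    // table -> primary key -> raw row (column name -> value)
    private final Map<String, Map<Object, Map<String, Object>>> rows = new HashMap<>();
    // "table.fkColumn" -> foreign-key value -> primary keys of the referencing rows
    private final Map<String, Map<Object, List<Object>>> fkIndex = new HashMap<>();

    void put(String table, Object pk, Map<String, Object> row, String... fkColumns) {
        Map<Object, Map<String, Object>> tableRows =
                rows.computeIfAbsent(table, t -> new HashMap<>());
        boolean alreadyCached = tableRows.put(pk, row) != null;
        if (alreadyCached) return; // sketch: a re-read row is not re-indexed
        for (String fk : fkColumns) {
            fkIndex.computeIfAbsent(table + "." + fk, k -> new HashMap<>())
                   .computeIfAbsent(row.get(fk), v -> new ArrayList<>())
                   .add(pk);
        }
    }

    // Follows a foreign key toward its target, e.g. employee -> department.
    Map<String, Object> row(String table, Object pk) {
        Map<Object, Map<String, Object>> tableRows = rows.get(table);
        return tableRows == null ? null : tableRows.get(pk);
    }

    // Follows a relationship in the "many" direction, e.g. department -> employees.
    List<Map<String, Object>> referencing(String table, String fkColumn, Object fkValue) {
        List<Map<String, Object>> result = new ArrayList<>();
        Map<Object, List<Object>> index = fkIndex.get(table + "." + fkColumn);
        if (index == null || !index.containsKey(fkValue)) return result;
        for (Object pk : index.get(fkValue)) result.add(row(table, pk));
        return result;
    }
}
```

Objects would be instantiated from these rows only when the application actually asks for them, which is the second of the two stages described earlier.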


Registry 

To guarantee the uniqueness of the objects within the appli- 
cation’s memory, each instantiated persistent object must be 
registered into a centralized registry. The objects are usually 
identified in the registry using their persistent object identi- 
fiers; see Figure 9. 

As Figure 10 illustrates, when an object is retrieved using its 
object identifier the registry is searched first, then the data cache, 
and finally the database. The registry can be global if it is im- 
plemented using weak pointers, because objects are automati- 
cally removed from the registry when other objects no longer 
reference them. However, if weak pointers are not available, the 
registry must be localized. For example, transactions provide a 
good scope for local registries. 
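
The search sequence can be sketched in Java as follows, under the assumption that weak references are available (java.lang.ref.WeakReference). The Registry class and its two loader functions are hypothetical; the loaders stand in for the data cache and the database and return null on a miss.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical registry: at most one in-memory instance per object identifier.
class Registry<K, V> {
    private final Map<K, WeakReference<V>> objects = new HashMap<>();
    private final Function<K, V> fromCache;    // instantiates from cached data, or null
    private final Function<K, V> fromDatabase; // last resort, or null if not found

    Registry(Function<K, V> fromCache, Function<K, V> fromDatabase) {
        this.fromCache = fromCache;
        this.fromDatabase = fromDatabase;
    }

    V find(K id) {
        // 1. The registry itself; the reference is empty once the object was collected.
        WeakReference<V> ref = objects.get(id);
        V obj = (ref == null) ? null : ref.get();
        if (obj == null) {
            obj = fromCache.apply(id);                      // 2. the data cache
            if (obj == null) obj = fromDatabase.apply(id);  // 3. the database
            if (obj != null) objects.put(id, new WeakReference<>(obj));
        }
        return obj;
    }
}
```

The sketch never purges map entries whose objects have been collected; a production registry would probably do so, for example with a ReferenceQueue.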

Figure 7: Inheritance mapping schemes.


Queries

From the persistence framework’s point of view, queries are the 
behavior of persistent objects on their target database. Query in 
this context means any operation supported by the target database 
and executed by the persistence framework. This includes ba- 
sic create, read, update, and delete operations; inquiries (does 
an entity exist in the database, the sum of a set of columns); 
and specific operations defined by a particular database server 
such as “balance the account.” 

Invocation differences between different target datastores in- 
clude details such as native query representation, error handling, 
and result data interpretation and processing. The native query 
representation typically can be strings (as with dynamic SQL), 
host variables (static SQL, stored procedures), or records (main- 
frame messaging). 

Encapsulating the native query details within query objects 
can standardize target database invocation. For instance, an ob- 
ject application would never know whether the query object 
contains a SQL string, invokes a stored procedure, or sends a
message to a mainframe transaction-processing monitor. Figure 11
presents two sets of encapsulated queries targeting two differ- 
ent types of datastores. 
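
A minimal Java sketch of this encapsulation might look like the following. The WriteQuery interface and both implementations are hypothetical, and the fixed-width record layout is an invented stand-in for a real mainframe message format. The SQL text is built by plain concatenation only to keep the example short; real code would bind values through host variables or prepared statements.

```java
import java.util.Map;

// Hypothetical query object: the application sees only this interface.
interface WriteQuery {
    // Returns the native request that would be sent to the datastore.
    String nativeForm(Map<String, Object> data);
}

// Targets a relational database with dynamic SQL.
class SqlInsertQuery implements WriteQuery {
    private final String table;

    SqlInsertQuery(String table) { this.table = table; }

    public String nativeForm(Map<String, Object> data) {
        StringBuilder cols = new StringBuilder(), vals = new StringBuilder();
        for (Map.Entry<String, Object> e : data.entrySet()) {
            if (cols.length() > 0) { cols.append(", "); vals.append(", "); }
            cols.append(e.getKey());
            Object v = e.getValue();
            vals.append(v instanceof Number ? v.toString() : "'" + v + "'");
        }
        return "INSERT INTO " + table + " (" + cols + ") VALUES (" + vals + ")";
    }
}

// Targets a transaction-processing monitor with a fixed-width record.
class FixedRecordQuery implements WriteQuery {
    private final String transactionCode;
    private final int fieldWidth;

    FixedRecordQuery(String transactionCode, int fieldWidth) {
        this.transactionCode = transactionCode;
        this.fieldWidth = fieldWidth;
    }

    public String nativeForm(Map<String, Object> data) {
        StringBuilder record = new StringBuilder(transactionCode);
        for (Object v : data.values()) {
            // Pads each value to the field width; longer values are not truncated here.
            record.append(String.format("%-" + fieldWidth + "s", v));
        }
        return record.toString();
    }
}
```

Because both classes answer the same interface, the calling code does not change when the datastore behind a business object does.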

Queries can be grouped into two broad categories — write 
queries (SQL insert, update, and delete, for example) and read 
queries (SQL select). 

Input for write queries can be either keys (for instance, delete 
an object based on its key) or full objects (insert an object);
either of which can be collections. Queries targeting relational
databases operate on a single object. Queries targeting stored 
procedures or mainframe transaction-processing monitors usu- 
ally take multiple objects as input parameters. 

Write queries extract the data from persistent objects and con- 
vert it to the target database form. Depending on the datastore, 
the data is placed into a query string, a query’s host variables, 
or a record structure. In the case of nested records (mainframe 
messaging), the data may also need to be recomposed accord- 
ing to the nesting structure; see Figure 12. 

Because relational write queries can operate only on one 
object at a time, the number of database roundtrips within a
complex transaction often becomes high. A useful performance
optimization is to group the native queries together, then send
them to the database as one package at the end of the trans- 
action. Many relational databases support this kind of “batch” 
behavior. For procedure calls this is the typical mode of op- 
eration. 
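
The grouping idea can be sketched in a few lines of Java. WriteBatch is a hypothetical class; its sender callback represents a single roundtrip to the database and, in a JDBC-based framework, could be implemented with Statement.addBatch and Statement.executeBatch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical deferred-write buffer owned by a transaction.
class WriteBatch {
    private final List<String> pending = new ArrayList<>();
    private final Consumer<List<String>> sender; // one call = one database roundtrip

    WriteBatch(Consumer<List<String>> sender) { this.sender = sender; }

    // Queues a native query instead of executing it immediately.
    void add(String nativeQuery) { pending.add(nativeQuery); }

    // Called at commit: ships all queued queries together and returns their count.
    int flush() {
        int count = pending.size();
        if (count > 0) {
            sender.accept(new ArrayList<>(pending));
            pending.clear();
        }
        return count;
    }
}
```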

Read queries fall into two categories — those that have no 
scope limiting conditions (“all instances” queries, for exam- 
ple) and those that require parameters for search conditions 
(“finder” queries). Read queries that require parameters must 
address the same data conversion and recomposition issues 
as the write queries. 

Restructuring the resulting data is necessary when the data is 
not shaped along object lines and/or the result contains data for
more than one kind of object. For example, queries involving
certain inheritance strategies or reading ahead trees of objects 
require joins and unions that result in tuples containing data for 
multiple objects. A useful abstraction for the result processing 
is a data extractor. The data extractor contains all the necessary 
logic to extract, convert, validate, 
and compose the data into a form 
suitable for the target persistent 
object. In case of relational joins, 
the extraction logic must also elim- 
inate redundant entries in the re- 
sult set; see Figure 13. 
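
As an illustration only, here is a small Java sketch of a data extractor for the rows of a department-employee join. The class DeptEmpExtractor and the four-column row layout are assumptions; type conversion and validation are left out, so the sketch shows just the regrouping and the elimination of redundant entries.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical extractor for rows of a DEPARTMENT-EMPLOYEE join.
// Each row is {deptNo, deptName, empNo, empName}; empNo is null when a
// left outer join found no employee for the department.
class DeptEmpExtractor {
    static class Dept {
        final String deptNo;
        final String name;
        final Map<String, String> employees = new LinkedHashMap<>(); // empNo -> name

        Dept(String deptNo, String name) {
            this.deptNo = deptNo;
            this.name = name;
        }
    }

    static List<Dept> extract(List<String[]> rows) {
        Map<String, Dept> depts = new LinkedHashMap<>();
        for (String[] r : rows) {
            // The department columns repeat on every joined row; keep one copy.
            Dept d = depts.get(r[0]);
            if (d == null) {
                d = new Dept(r[0], r[1]);
                depts.put(r[0], d);
            }
            // Keying by empNo collapses employees that appear more than once.
            if (r[2] != null) d.employees.put(r[2], r[3]);
        }
        return new ArrayList<>(depts.values());
    }
}
```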

To optimize the number of 
database roundtrips, the read 
queries need to be capable of 
loading trees of objects rather than 
reading one object at a time. The 
required native operations for re- 
lational queries are equijoin for 
loading chains of objects, unions 
and set differences for loading trees, 
and left-outer-joins for loading 
trees that allow missing leaves. 


Associations 

Describing the associations between
object classes is an essential element of object modeling
and design. UML and other object modeling methodologies pro- 
vide ways of defining the semantics of associations in terms of 
their cardinality and navigability. 

The behavior of associations can be fairly complex. The im- 
plementation details can be hidden behind accessor methods 
(get methods). Accessors for one-to-one associations return the 
member object of the association. An accessor for a one-to-many 
association returns a collection of member objects. Another ap- 
proach (see Figure 14) is to implement associations as first-class 
objects (in-place association instances, proxies). 

At run time, the object referential integrity should be maintained
according to the semantics specified in the object model,
while allowing the application programmer the easiest and
most flexible interface to the relationships. Mutators (set meth- 
ods, for example) and collection add/remove methods should 
automatically invoke the appropriate referential integrity main- 
tenance behavior, such as updating the inverse association. 

Associations are especially important for persistent objects 
mapped to relational databases because associations can also 
provide automatic means for maintaining the database key ref- 
erential integrity. When connecting persistent objects, the asso- 
ciation will determine which persistent object holds the foreign 
key and update it appropriately with the primary key of the 
other object. Manually coding the database key maintenance
is error prone and can easily lead to unmaintainable code.
Figure 15 illustrates automatic maintenance of object and
database key referential integrity. In this example, an employee
object is automatically removed from its old department when
the object is added to a new department. Also, the inverse
relationship from the employee to the department is updated
automatically.

Figure 10: Search sequence when retrieving objects.
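
The employee and department scenario can be sketched in Java as shown here. The classes and the deptForeignKey field are hypothetical; the field merely mirrors the value a framework would write to the employee's foreign-key column. Associations are implemented behind accessor and mutator methods rather than as first-class objects.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

class Department {
    final String deptNo; // primary key
    private final Set<Employee> employees = new LinkedHashSet<>();

    Department(String deptNo) { this.deptNo = deptNo; }

    Set<Employee> getEmployees() { return Collections.unmodifiableSet(employees); }

    // Adding on the "many" side delegates to the mutator on the "one" side.
    void addEmployee(Employee e) { e.setDepartment(this); }

    void removeEmployee(Employee e) {
        if (employees.contains(e)) e.setDepartment(null);
    }

    // Called only by Employee to keep both ends consistent.
    void link(Employee e) { employees.add(e); }
    void unlink(Employee e) { employees.remove(e); }
}

class Employee {
    final String empNo; // primary key
    private Department department;
    private String deptForeignKey; // mirrors the foreign-key column

    Employee(String empNo) { this.empNo = empNo; }

    Department getDepartment() { return department; }
    String getDeptForeignKey() { return deptForeignKey; }

    void setDepartment(Department newDept) {
        if (department == newDept) return;
        if (department != null) department.unlink(this);   // leave the old department
        department = newDept;
        deptForeignKey = (newDept == null) ? null : newDept.deptNo; // keep the key in step
        if (newDept != null) newDept.link(this);            // update the inverse side
    }
}
```

Calling d2.addEmployee(emp) for an employee that currently belongs to d1 removes it from d1, adds it to d2, and sets the foreign key to d2's primary key, all without any key handling in application code.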

Associations provide a seman- 
tically meaningful way for con- 
trolling the retrieval of objects 
from the database. As the appli- 
cation traverses associations, the 
related objects can be retrieved 
accordingly. Depending on the 
association, it is sometimes also 
desirable that traversal of one as- 
sociation triggers the retrieval of 
an entire graph of related objects. 
However, this kind of object 
graph read-ahead behavior re- 
quires advanced querying and 
caching techniques as described 
in the previous sections. 

Translation from the object as- 
sociations to the native database 
relationships may be very complex (see Figure 16). A simple
relationship between two classes often translates to multiple
relationships between multiple tables when inheritance is involved.


Transactions 

In enterprise environments a single server application may serve 
multiple concurrent client transactions, each accessing an over- 
lapping set of objects. 

Many enterprise applications that reflect complex business 
processes (see Figure 17) also require that users can navigate 
freely between different views of the user interface, work 
with the result of uncommitted changes across views, and 
commit or cancel work that has been done on a view and 
on all subviews opened in a nested fashion. In short, the na- 
ture of complex multiuser enterprise applications requires 
that objects can be accessed from multiple concurrent and 
nested transactions. 

To ensure the consistency of concurrently running transac- 
tions they need to be isolated from each other. The two meth- 
ods for isolating the transactions are the conflict avoidance 
scheme (“pessimistic” scheme) and the conflict detection 
scheme (“optimistic” scheme). Which one to use depends on 
the type of transaction. Transactions that have a high penalty 
for failure should do whatever possible to prevent the failure
(by explicitly locking the resources as early as possible). With
low penalty transactions it is often worth trading the risk of 
failure to gain efficiency by using a conflict detection scheme. 

The objects are copied from the database into the applica- 
tion’s memory, where they may be held for extended periods 
of time. Therefore, the transaction isolation actually consists of 
two components: the object level isolation within one application,
and the database level isolation across multiple applications.
Both isolation components address multiuser issues, because
one server application may also serve multiple clients, as
in Figure 17. 

The conflict avoidance scheme for GUI-driven, long-running
transactions is usually unacceptable from a performance perspective.
A conflict detection scheme where each transaction
has a version of the concurrently accessed objects provides
significantly better performance. However, managing multiple versions
of the same object can be fairly complex.

Figure 11: Two sets of encapsulated queries.

One approach for implementing an object versioning mechanism
is to divide business objects into two parts: a wrapper
and a version (for example, an EJBObject and an EntityBean).
When any object refers to a business object, it actually refers to 
its wrapper. The wrapper delegates the method invocations to 
the appropriate version, which contains the object's business 
behavior and instance data. When a business object is first ac- 
cessed (get/set a property) within a transaction, a new version 
of the object is added to the current transaction’s local registry. 
The new version is based on the version in the parent transac- 
tion’s registry. Figure 18 shows multiple object versions within 
a tree of nested transactions. 

Upon commit, the versions in a child transaction’s registry are 
merged with its parent transaction’s corresponding versions. If 
the transaction is a top-level transaction, the versions are also 
written into the database. The logic for detecting and resolving 
conflicts on merge is highly application dependent. The test may 
be as simple as comparing parent and child version numbers in
order to determine if the parent version has been changed after
the child version was created. For more advanced application-dependent
testing the wrapper could have a conflict resolution
callback method.

    ... where custno=456 and streetno=56 ...

Example 1: Update statement with conflict detection predicates.

On rollback the child versions are simply dropped instead of 
having to restore object states in the parent transaction. After 
rollback there is no trace that ei- 
ther the child transaction or the 
child versions ever existed. 
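
The versioning scheme can be sketched in Java along the following lines. The classes Version and Tx are hypothetical, object state is simplified to a property map, and the wrapper is left out: the transaction plays its part by routing every get and set to the right version. The conflict test is the simple version-number comparison mentioned above, and a top-level commit, which would also write to the database, is omitted.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical versioned state of one business object.
class Version {
    final int number; // incremented each time a child version is merged in
    final Map<String, Object> state;

    Version(int number, Map<String, Object> state) {
        this.number = number;
        this.state = state;
    }
}

// Hypothetical nested transaction with a local registry of object versions.
class Tx {
    private final Tx parent; // null for a top-level transaction
    private final Map<String, Version> registry = new HashMap<>();
    private final Map<String, Integer> basedOn = new HashMap<>(); // parent number at copy time

    Tx(Tx parent) { this.parent = parent; }

    // Registers a version directly, e.g. one just read from the database.
    void store(String id, Version v) { registry.put(id, v); }

    // First access within this transaction copies the parent's version.
    Version version(String id) {
        Version v = registry.get(id);
        if (v == null && parent != null) {
            Version base = parent.version(id);
            if (base == null) return null;
            v = new Version(base.number, new HashMap<>(base.state));
            registry.put(id, v);
            basedOn.put(id, base.number);
        }
        return v;
    }

    Object get(String id, String property) { return version(id).state.get(property); }

    void set(String id, String property, Object value) {
        version(id).state.put(property, value);
    }

    // Merges this transaction's versions into the parent's registry.
    void commit() {
        if (parent == null) return; // a top-level commit would write to the database
        for (String id : registry.keySet()) {
            Version current = parent.version(id);
            Integer base = basedOn.get(id);
            if (current != null && base != null && current.number != base) {
                throw new IllegalStateException("conflict on " + id);
            }
        }
        for (Map.Entry<String, Version> e : registry.entrySet()) {
            Version current = parent.version(e.getKey());
            int next = (current == null) ? 1 : current.number + 1;
            parent.registry.put(e.getKey(), new Version(next, e.getValue().state));
        }
        registry.clear();
        basedOn.clear();
    }

    // Child versions are simply dropped; the parent is untouched.
    void rollback() {
        registry.clear();
        basedOn.clear();
    }
}
```

For brevity the sketch bumps the version number even when a child only read the object; distinguishing reads from writes is one of the application-dependent refinements the text alludes to.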

Many relational databases provide little support for row-level
conflict avoidance. With most databases the row-level locking
is available only in conjunction with cursors. However, cursors
may be of little use for an object application that is accessing
and holding onto large numbers of different types of objects in a
random fashion. One trick for acquiring a row-level lock without
a cursor is to touch a corresponding row (update a column
without changing its value, for instance) when an object is first
accessed within a transaction. If the row is already locked, the
desirable action is often to raise an exception instead of waiting
for the lock to be released.

As with object level isolation, the logic for detecting and resolving
database conflicts is application dependent. The two
common conflict detection methods are either to reread and
compare the database row to the modified object, or to add collision
detection predicates (a set of attributes that constitute a
conflict) to the where clause of the database update statement.
Example 1 demonstrates conflict detection predicates. The update
statement will fail if another user has changed the street
number from its old value.

Rereading and comparing rows is expensive and should be
used sparingly, because it requires multiple database
roundtrips (locking, reading, and updating the row). On the
other hand, the use of conflict detection predicates is
lightweight and works fine in most situations. More sophisticated
detection schemes can be composed of combinations
of the aforementioned commands.

Most commercial databases have referential integrity (RI)
constraints for maintaining the consistency of the database.
These constraints require the database's store and delete operations
to be executed in a specific order. This order does
not necessarily match the order in which the objects are created
or deleted within an object application. Furthermore, the
database RI constraints do not map to the logical object
associations in a consistent way. RI rules are enforced based on the
foreign-key references, which may have more than one possible
transformation when mapped to object associations.
Manually coding the operation ordering is time consuming and
error prone, easily leading to unmaintainable code. It is
preferable to defer execution of the operations and let the
transaction automatically decide the ordering upon its commit.

The ordering algorithm utilizes the information of how the object
associations are mapped to the primary-key/foreign-key
column pairs in the database, and the integrity rules defined
for the key columns. For each object within the transaction,
the algorithm iterates over the associations the object has with
other objects. For each association, the algorithm tests if the
object has either insert precedence (if the object is to be inserted)
or delete precedence (if the object is to be deleted)
over the association. If the object has a higher precedence,
it will be moved accordingly in the transaction's participant
list. Due to the nature of relational RI constraints, the algorithm
remains fairly simple, because there cannot be circular
constraints defined in the database (otherwise it would be
impossible to insert a row that has a prerequisite to its own
prerequisite).
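
The sketch below shows one way to obtain such an ordering in Java. It does not reproduce the participant-list algorithm described above; it substitutes a standard depth-first topological sort over the foreign-key references, which yields an order with the same property. The class WriteOrderer is hypothetical, and objects are identified by plain strings to keep the example short.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper that orders a transaction's participants for writing.
class WriteOrderer {
    // object -> the objects it references through foreign keys
    private final Map<String, Set<String>> references = new LinkedHashMap<>();

    void participant(String obj) {
        references.computeIfAbsent(obj, k -> new LinkedHashSet<>());
    }

    // Declares that child holds a foreign key pointing at parent.
    void reference(String child, String parent) {
        participant(parent);
        participant(child);
        references.get(child).add(parent);
    }

    // Referenced rows first: a valid order for inserts.
    List<String> insertOrder() {
        List<String> order = new ArrayList<>();
        Set<String> done = new LinkedHashSet<>();
        for (String obj : references.keySet()) visit(obj, done, order);
        return order;
    }

    // Referencing rows first: a valid order for deletes.
    List<String> deleteOrder() {
        List<String> order = insertOrder();
        Collections.reverse(order);
        return order;
    }

    private void visit(String obj, Set<String> done, List<String> order) {
        if (!done.add(obj)) return; // already handled; assumes no circular constraints
        for (String parent : references.get(obj)) visit(parent, done, order);
        order.add(obj); // everything obj references is already in the list
    }
}
```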


API 

From the programming and maintenance point of view, the 
number of persistent constructs that appear in the application 
code should be kept as low as possible. Having a low number 
of persistence constructs introduces minimal intrusion upon 






Figure 13: Restructuring a relational result set and eliminating redundant entries. 
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the application, thus allowing the database and application to 
remain loosely coupled. This loose coupling between the 
database and application lets you design an object model that 
models the application domain as opposed to modeling the 
database design and vice versa. The persistent framework must 
be intelligent enough to perform many of the necessary per- 


























Fig 


Transaction tx = Transaction new(); 


x set Dept Frgn Key("D2") 


ure 16: Complex translation from an object association to multiple database relationships. 


sisting processes automatically, without instruction from the 
application. Implementing persistent constructs as first-class 
objects and. providing some of the persistence metainforma- 
tion at run time are two of the keys that make a successful 
persistence framework. The interfaces provided by the per- 
sistence API can be grouped into the following categories: 


¢ Business object interface. Protocol for accessing attributes from 
the business object. 

e Life cycle interface. Protocol for creating and destroying busi- 
ness object instances. 

e Finder interface. Protocol for finding business object instances. 

e Transaction interface. Protocol for creating, committing, and 
rolling back transactions. 


For example, the Enterprise JavaBeans (EJB) Specification de- 
fines interfaces that correspond to these categories. The remote 
interface for entity Beans corresponds to the business object in- 
terface. The EJB home interface has the same responsibilities as 
the life cycle and finder interfaces. The transaction interface is 
provided by the UserTransaction in the Java transaction pack- 
age, which is one of the prerequisites for the EJB. 





set Dept(dept2) (3) 





EmployeeHomeImp1 empHome = EmployeeHomeImp1.singleton() ; 


Employee emp; 


AddressHomeImp1 addrHome = AddressHomeImp1.singleton() ; 


Address addr; 

tx.begin() ; 

emp = empHome. findByKey ("1234") ; 

addr = addrHome.create(); 
addr.setStreet("123 Somewhere Dr."); 


//begi 


emp. setAddress (addr) ; 
tx.commit () ; 


Example 2: Sample persistence API code. 
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n a new transaction (transaction interface) 
//fnd an employee instance (finder interface) 

//create an address instance (factory interface) 

//set attributes of the address (bus.object interface) 


//set employee's address (bus.object interface) 
//commit the changes (transaction interface) 
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Figure 17: A typical system configuration in an enterprise 
environment. 





Figure 18. Multiple object versions within a tree of nested 
transactions. 


Example 2 demonstrates the use of the persistence API by re- 
trieving an employee object, creating an address object, associ- 
ating these two objects together, and committing the changes 
to the database. 


Conclusion 
The rationale for building an object persistence framework is,
of course, increased productivity and reduced maintenance costs.
Independence between object applications and databases allows 
enterprises to develop and maintain more complex applications 
and still leverage existing data management infrastructures. 
Implementing a full-blown object persistence framework 
easily represents several years' worth of work. The more flexibility
and performance that is required from the framework,
the more complex the framework becomes. Yet almost any 
framework is better than no framework. Even a simple frame- 
work can help in structuring the code in a clean and logical 
way. For example, the mapping metainformation can implic- 
itly be represented as inlined code and the query objects can 
encapsulate handcrafted SQL strings. The areas worth spend- 
ing more time in creating generic components, however, are 
the associations and the transactions because they have a di- 
rect impact on the application programming model. There are 
also several commercial object persistence frameworks avail- 
able that are usually a viable alternative to in-house devel- 
opment, especially when the target application is complex and 
critical to the enterprise.

encapsulations of
database objects





Paul Lipton 


Relational databases require a veneer
of mapping code to translate the state
of Java objects to their spreadsheet-like
world of rows and columns. In
object-oriented terms, the fundamental unit 
of manipulation for relational databases is 
the instance property, not the object. In 
other words, SQL programmers have to 
spend a lot of time thinking about the table 
column, which object-oriented program- 
mers think of as the instance property and 
not the object itself. In many ways, they 
are forced to work in a unique, artificial, 
two-dimensional world. Object-oriented 
developers, on the other hand, build com- 
plex three-dimensional networks of ob- 
jects in an effort to model the real world. 
This model, like the real world, explicitly 
represents complex one-to-many and
many-to-many relationships, such as collections.

Paul is director for object technology at
Computer Associates and a technical representative
to the Object Data Management
Group. He can be reached at
paul.lipton@cai.com.

Object databases, by definition, store 
and retrieve objects. The fundamental unit 
of manipulation throughout the applica- 
tion and the object database architecture 
is consistently the object. Thus, object
databases let you avoid the added complexity,
risk of error, and overhead of mapping
code to translate between the world
of objects and the spreadsheet-like world
of the relational database.

How do object databases work with Java? 
In many cases, Java object persistence for 
object databases is handled effectively by 
an object-level Java language binding. Usu- 
ally, this binding is based upon the Object 
Data Management Group’s (ODMG) Java language
binding specification. The ODMG is
an industry consortium of object database, 
object-relational mapping software, and ap- 
plication server vendors. Unlike an older, 
API-based approach to databases such as 
JDBC, the ODMG’s tight language binding 
preserves the Java object model. 

Most importantly, this language binding 
lets Java programmers access a database 
without having to leave the Java universe; 
no SQL or deep understanding of database 
specifics is required. In fact, developers 
are usually expected to define database 
object schema within their client Java pro- 
grams as Java class definitions. This is of- 
ten extremely useful, especially when de- 
velopers are struck with the stunning 
realization that they need to store and re- 
trieve the state of certain Java objects at 
run time. This profound realization often 
occurs after much code has been written! 

However, as the ODMG approach to 
Java object persistence currently stands, 
there is no explicit mechanism for taking 
advantage of advanced object databases 
that can execute methods on the database 
server, as well as in the Java client. The 
ODMG standard assumes that methods 
are always executed on the client’s Java 
Virtual Machine (JVM). The ODMG bind- 
ings also do not have any special support 
for multimedia. Another concern is that 
typical users of the ODMG binding do 
not often maintain database schema by 
using the database’s native Object Defi- 
nition Language (ODL). Rather, you are 
usually expected to use Java to define
database object schema within your client 
Java programs. 

This might be acceptable, or even de- 
sirable, in an application system where 
all the code is written in Java, but Java 
developers often need to access and up- 
date database objects created by a wide 
range of applications already written in 
different languages, such as C++, Delphi, 
or Visual Basic. This is particularly true 
in large enterprises. Obviously, defining 
and maintaining the schema for such ob- 
jects in the source code of one particu- 
lar application, such as a Java program, 
is like the tail wagging the dog. It makes 
more sense, under these circumstances, 
to store and maintain database schema 
definitions using the database ODL. Java
proxy technology is the answer to the issues
of explicit support of unique
database features and centralized schema 
definition. 


Java Proxy Technology 

Java proxy technology can allow devel- 
opers to define database object schema 
using the database ODL. To illustrate how 
such a technology might be implement- 
ed, and to describe how to use such an 
approach, I will provide examples based 
on Jasmine, the object-oriented database 
from Computer Associates (the company 
I work for). 

Jasmine contains an object database 
with support for server- and client-side 
methods, along with a GUI development 
environment, and support for ActiveX,
HTML, C/C++, Java, and more. For those
who choose to develop applications us- 
ing Java, Jasmine includes two compo- 
nents. One component, “pJ” (short for 
“persistent Java”), is a complete imple- 
mentation of the ODMG Java language 
binding, and includes support for the Ob- 
ject Query Language (OQL), an important 
part of the ODMG specification that ven- 
dors often fail to implement. It is the of- 
ficial ODMG way to do queries at the ob- 
ject level, and is similar to SQL-92 with 
object-oriented extensions. By supplying 
both pJ and OQL, Jasmine complies with 
the ODMG standard for Java. 

However, persistent Java is only one 
side of the equation. While designing Jas- 
mine, it became clear that it was also nec- 
essary to provide Java proxy support so 
that Java clients could use Jasmine’s ad- 
vanced features such as server-side meth- 
ods and integrated multimedia support. 
Also, it was desirable to allow definition 
of database object schema using Jasmine’s 
ODL, called Object Database Query Language
(ODQL), if required. In Jasmine,
the Java proxy support is, appropriately, 
in the component called “Jp” (“Java 
proxy”), which comes with Jasmine. 

The idea of a Java proxy is that an ob- 
ject in the client JVM represents (or prox- 
ies) the object in the database itself. The 
Java proxy is an extremely thin, almost 
stateless object, well suited to Internet de- 
ployment. It completely encapsulates the 
object it represents in the database. 


Reverse Java Language Bindings 

The reverse language binding (generating 
database-aware Java classes from database 
classes) is based on Java proxy technology
in Jasmine. To generate a Java proxy
reverse language binding class definition in
Java, one or more database classes are selected
for processing by a Java program
called “JPCG” (short for “Java Proxy Class
Generator”), which generates Java source
code. JPCG reads database schema for one
or more database classes, and generates a 
100 percent pure Java class definition for 
each Jasmine class. Each Java class defini- 
tion is a JavaBean that is also a Java proxy. 
Any database class- or instance-level prop- 
erty can be retrieved and updated in the 
database by using the Bean’s get and set 
methods. 

Consider Listing One, which is a Cus- 
tomer Java object, bound to the database 
Customer class by JPCG. For this exam- 
ple, the Customer database class has been 
simplified to have only one property, and 
one class-level (static) method. The get 
and set methods in the Bean encapsulate 
the CustomerNumber instance property of 
Customer objects in the database. You can 
see how the database data types relate to 
Java data types in methods generated by
JPCG. In this example, the get and set
methods map the property directly
to a Java primitive, as required for a JavaBean.
With primitives, database NIL values are simply
mapped to values that make
sense, like zero.

The getLowCreditRisks method in Listing
One is also generated by JPCG and
maps to the equivalent method in the 
database’s Customer class. It shows that 
calls to the methods of a JPCG generated 
JavaBean on the client automatically re- 
sult in the appropriate server-side meth- 
ods being called via that Bean’s proxy. 
On the client, the method’s parameters are 
marshaled and passed to the database ob- 
ject or class that the proxy is representing 
where the equivalent method is executed 
on the database server. The results of the 
method’s execution are returned to the 
client JVM. The Java proxy object on the 
client JVM appears to have executed the 
method locally. The calling Java object is 
never aware that the method was actually 
executed on the server. 
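
The forwarding just described can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions, not Jasmine's Jp API: FakeServer, ThinProxy, and callMethod are hypothetical names, and an in-memory map stands in for both the database server and the network transport.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for the database server: it owns the object state
// and the method bodies. Nothing here is part of the real Jp API.
class FakeServer {
    private final Map<String, Map<String, Object>> objectsByOid = new HashMap<>();
    private final Map<String, Function<Object[], Object>> methods = new HashMap<>();

    void store(String oid, Map<String, Object> state) { objectsByOid.put(oid, state); }

    void define(String name, Function<Object[], Object> body) { methods.put(name, body); }

    Object getProperty(String oid, String name) {
        Map<String, Object> state = objectsByOid.get(oid);
        if (state == null) throw new IllegalArgumentException("No such object: " + oid);
        return state.get(name);
    }

    Object invoke(String name, Object[] args) {
        Function<Object[], Object> body = methods.get(name);
        if (body == null) throw new IllegalArgumentException("No such method: " + name);
        return body.apply(args);
    }
}

// The proxy keeps only a server reference and an OID; every call is forwarded,
// so the caller cannot tell that the work happened elsewhere.
class ThinProxy {
    private final FakeServer server;
    private final String oid;

    ThinProxy(FakeServer server, String oid) {
        this.server = server;
        this.oid = oid;
    }

    Object getProperty(String name) { return server.getProperty(oid, name); }

    // Packs the arguments into an array (a trivial form of marshaling)
    // and hands them to the "server" for execution.
    Object callMethod(String name, Object... args) { return server.invoke(name, args); }
}

class ProxyDemo {
    static Object run() {
        FakeServer server = new FakeServer();
        Map<String, Object> state = new HashMap<>();
        state.put("customernumber", 42);
        server.store("oid-1", state);
        server.define("twice", a -> ((Integer) a[0]) * 2);

        ThinProxy proxy = new ThinProxy(server, "oid-1");
        int number = (Integer) proxy.getProperty("customernumber");
        return proxy.callMethod("twice", number);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The proxy carries no copy of the object's state, which is what keeps it thin: losing a proxy loses nothing that the server does not already hold.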

The advantage, of course, is that server- 
side logic can be secured and scaled as 
necessary, so that sensitive or processor 
intensive logic need not be downloaded 
to the client. Also, complex commercial 
libraries that would be too expensive or 
large to download to a client can be ac- 
cessed and used on the server. Jasmine 
allows compiled logic residing in shared 
libraries on the server to be used, result- 
ing in high-performance shared logic run- 
ning close to the data. 

The createCustomer method creates an 
object of type DBObject to serve as the 
underlying proxy for this Bean. Every 
JPCG generated Bean can access the meth- 
ods in its proxy object directly by using 
its toDBObject method. DBObject has two 
more methods that use or return Jasmine 
data types that are mapped to java.lang.* 
objects. This is so that NIL property val- 
ues in Jasmine can be mapped to a null 
value in Java. It is also possible, using a 
proxy’s isPropertyNil method, to ascertain 
if a property is NIL without retrieving its 
value. Special methods are available for 
each proxy to allow multiple properties 
to be obtained and changed for a database 
object. This helps minimize network traf- 
fic. Constructors, destructors, and various 
other utility methods are also available in 
a proxy.
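
The two NIL conventions (zero for primitives, null for java.lang.* objects) can be compared side by side in a small sketch. The class below is hypothetical and holds its properties in a local map rather than a database; the method names echo the ones mentioned in the article, but their signatures here are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of NIL handling; a key mapped to null (or absent)
// models a NIL property. This is not the real DBObject class.
class NilAwareProxy {
    private final Map<String, Integer> properties = new HashMap<>();

    void setProperty(String name, Integer value) { properties.put(name, value); }

    // Object-typed accessor: NIL surfaces as null, so it stays distinguishable.
    Integer getProperty(String name) { return properties.get(name); }

    // Primitive accessor: NIL collapses to zero, as a JavaBean getter requires.
    int getIntProperty(String name) {
        Integer value = properties.get(name);
        return (value == null) ? 0 : value;
    }

    // Lets a caller test for NIL without looking at the value itself.
    boolean isPropertyNil(String name) { return properties.get(name) == null; }
}
```

With the primitive accessor alone, a NIL customer number and a genuine zero look the same, which is why the object-typed accessor and the NIL test are worth having.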


Dynamic Java Proxies 

The ability to statically bind a Jasmine 
database class directly to an equivalent 
generated Java proxy class using JPCG can 
be useful, especially for applications that 
depend on specific, known classes. How- 
ever, there are times when database ac- 
cess may need to be dynamic. Tools such 
as report writers may need to access classes
that did not even exist at the time that
they were written.

Jp lets a number of different Java proxy 
objects be created dynamically. Some of 
these objects proxy database objects or 
classes, while others encapsulate related 
database functions such as support for 
queries, dynamic database schema access, 
or database session and transaction con- 
trol. This dynamic portion of Jp is referred 
to as J-API. 

There are six basic classes in J-API:
Database, DBObject, DBClass, DBCollection,
ODQLStatement, and DatabaseMetaData.
The Database class represents
the database itself. Each Database ob- 
ject contains methods that can connect 
to the database, establish a database session,
and establish either a client/server
or three-tier connection to the database.
Multiple Database objects can be instantiated
within an application, each fully
encapsulating a client/server or three-tier
connection and session with
a Jasmine database. Database transactions
for each session are also controlled,
and their state examined, via Database
instance methods, such as rollbackTransaction,
that are defined for the
Database class.

A Database object also contains con- 
venient factory methods for all the other 
classes in J-API. The advantage of using 
the Database object’s factory methods, in- 
stead of the constructor for the other class- 
es in J-API like DBObject, is that the fac- 
tory methods allow the instantiation of 
proxies that are automatically associated
with the correct database session. Usually,
the less convenient alternative is to
specify the Database object in a class’s 
constructor to create this association. This 
encapsulation of the database session, 
and all other aspects of the Jasmine 
database, is an important principle in the 
design of Java proxies. To maximize per- 
formance, Jp is designed and optimized 
to fit tightly with the Jasmine object 
database server architecture. 

The proxy classes that encapsulate ac- 
tual database objects are, of course, of pri- 
mary concern here. The DBObject and 
DBClass classes are merely instance-level 
and class-level (static) versions of the same 
basic idea. Each of these classes has the 
same kinds of methods. An instance of 
DBObject encapsulates the instance-level 
database object’s properties and methods 
while an instance of DBClass does the 
same for a database object’s class-level 
properties and methods. 

Since it doesn’t really matter which of 
these two proxy classes I choose, let’s look 
at DBObject. To be a proxy, an instance 
of DBObject must be associated with a 
database object. This is done by setting a 
DBObject to the object identifier (OID) of
a particular object in the database. An OID 
is a logical value that uniquely identifies 
an object in the database. An object’s OID 
is immutable in Jasmine. In Jp, an OID is 
represented as a String. 

In designing Jp, an effort was made to 
provide a rich set of methods for DBObject
creation and association with Jasmine
database objects. For example, it is pos- 
sible to use the earlier mentioned factory 
methods to instantiate a database object 
on the database server, and associate it 
with a newly created Java proxy object 
on the client JVM at the same time. The 
method in Example 1 would accept the 
name of a database class family (similar 
to a package in Java) and class name, and 
return a DBObject that proxies a newly
created database object. Once the database 
transaction is committed, that object be- 
comes persistent within the database. A 
DatabaseException, extended from 
java.lang.Exception, can be thrown for 
any proxy method that can return a 
database or communication error. Of 
course, methods exist to obtain error text, 
as well as to determine the type of error. 

Listing Two is a fully functional Java pro- 
gram that creates a new Customer object 
in the database, and proxies it with a newly
created Java proxy assigned to the reference
variable dbo. Once the program
commits the database transaction with the 
endTransaction method, the newly creat- 
ed Customer object is permanent unless 
explicitly deleted. Had the rollbackTransaction
method been called instead, the
database would be brought back to its
original state, as if the createDBObject
method had never been invoked.

A variation of the createDBObject 
method allows the specification of prop- 
erty values at database object creation time. 
The values are passed in as key-value pairs 
using a hash table. The getDBObject 
method accepts an OID, and returns the 
appropriate type of J-API object: DBOb- 
ject, DBClass, or DBCollection. 

It is important to be able to dynamically 
change the database object that a DBOb- 
ject is proxy to. The setOID method allows 
a DBObject to be a proxy to more than 
one database object in its lifetime, thus 
avoiding the overhead of Java object creation.
The getOID method returns the OID
for an instance of DBObject. 
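
A minimal sketch of that reuse pattern follows. RetargetableProxy and its backing map are hypothetical stand-ins for DBObject and the database; only the idea of re-pointing one proxy at successive OIDs comes from the article.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical proxy that can be re-pointed at a different database object.
class RetargetableProxy {
    private final Map<String, String> namesByOid; // stand-in for the database
    private String oid;

    RetargetableProxy(Map<String, String> namesByOid) { this.namesByOid = namesByOid; }

    void setOID(String oid) { this.oid = oid; }

    String getOID() { return oid; }

    String getName() { return namesByOid.get(oid); }
}

class ReuseDemo {
    // Visits every object with a single proxy instance instead of one per object.
    static String joinNames() {
        Map<String, String> db = new LinkedHashMap<>();
        db.put("oid-1", "Ada");
        db.put("oid-2", "Grace");

        RetargetableProxy proxy = new RetargetableProxy(db);
        StringBuilder names = new StringBuilder();
        for (String oid : db.keySet()) {
            proxy.setOID(oid);
            names.append(proxy.getName()).append(';');
        }
        return names.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinNames());
    }
}
```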


Collections and Server-Side Logic 

The DBCollection class encapsulates the 
concept of a collection (array, bag, list, 
set). Collections can contain objects or 
atomic literals. They can be the proper- 
ties of persistent database objects (and 
thus persistent themselves), or they may 
be temporary. 

The idea of a temporary collection is 
tied to how Jasmine implements a 
database session. In Jasmine, the concept 
of a database session is more than just a 
connection between client and server. 
The Jasmine engine is multithreaded, and 
each database session has associated with
it an execution thread that contains its
own server-side run-time environment.

Example 1: This method accepts the name of a database class family and class
name, and returns a DBObject that proxies a newly created database object.

Each session run-time environment is 
where server-side logic executes for a 
client. Thus, when server-side methods 
are called from a Java proxy, the method 
runs in its own thread on the server. The 
ODQL interpreted database language is 
always available in the server as part of 
the database session thread. The language 
is similar to C++ with extensions that make 
it object database aware. Its usefulness 
comes from the unique synergy of its three 
identifying characteristics: It is a true 
database language, computationally com- 
plete, and object oriented. 

Listing Three contains Java client code 
that calls a compiled server-side method 
called getLowCreditRisks. Jasmine allows
server-side methods to be compiled in 
ODQL and/or C++ in the Jasmine server. 
Compiled ODQL is merely preprocessed 
into C++, so that ODQL methods can take 
advantage of compiler optimization tech- 
nology from any of the popular C++ com- 
pilers for various platforms, such as Win- 
dows NT or Solaris. 

This server-side method can do a lot of 
filtering and refinement, so that the client 
need only receive a small subset of total 
objects available in the database. This is 
in stark contrast to older object database 
technologies that require all objects to be 
moved to the client for processing. This 
was a key design goal for Jasmine, because
it lets network traffic be reduced considerably.
Scalability is also improved because
the power of the server is, obviously,
more under our control than that of the
client may be.

Listing Three uses the Java proxy 
method called execMethod to invoke a 
server-side method called getLowCredit- 
Risks. The proxy’s execMethod logic mar- 
shals the parameters, if any, and passes 
them on via RMI to the class-level method 
on the server called getLowCreditRisks. 
The getLowCreditRisks method on the server
does some work and builds a temporary
collection that remains on the server.
The execMethod call returns a DBCollection,
assigned to the dblist variable on
the client JVM, that proxies this
temporary collection.

The DBCollection proxy class allows enu- 
meration of the collection. Only during enu- 
meration are any objects actually brought 
down from the server to the client. Objects 
are brought down in blocks. The size of 
these blocks can be programmatically con- 
trolled in order to let you optimize both 
network traffic and JVM memory utilization. 
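
One way to picture block-wise enumeration is the sketch below. It is an assumption-laden illustration, not the DBCollection implementation: BlockEnumeration is a hypothetical class, a local List stands in for the server-side collection, and fetchCount exists only to make the number of round trips visible.

```java
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical sketch: enumerates a "remote" list, pulling blockSize items per fetch.
class BlockEnumeration<T> implements Enumeration<T> {
    private final List<T> remote;      // stands in for the collection held on the server
    private final int blockSize;       // how many objects one round trip brings down
    private List<T> block = new ArrayList<>();
    private int nextRemoteIndex = 0;
    private int posInBlock = 0;
    private int fetchCount = 0;

    BlockEnumeration(List<T> remote, int blockSize) {
        if (blockSize < 1) throw new IllegalArgumentException("blockSize must be >= 1");
        this.remote = remote;
        this.blockSize = blockSize;
    }

    // Simulates one round trip: copy the next block of elements to the client.
    private void fetchNextBlock() {
        int end = Math.min(nextRemoteIndex + blockSize, remote.size());
        block = new ArrayList<>(remote.subList(nextRemoteIndex, end));
        nextRemoteIndex = end;
        posInBlock = 0;
        fetchCount++;
    }

    @Override
    public boolean hasMoreElements() {
        return posInBlock < block.size() || nextRemoteIndex < remote.size();
    }

    @Override
    public T nextElement() {
        if (!hasMoreElements()) throw new NoSuchElementException();
        if (posInBlock >= block.size()) fetchNextBlock();
        return block.get(posInBlock++);
    }

    int fetchCount() { return fetchCount; }
}
```

A larger block size means fewer round trips but more objects held in client memory at once, which is the trade-off the block-size control exposes.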

Because DBObject fully supports the
DataSource interface of JMF (Java Media
Framework), I can access each customer’s
picture by simply streaming the image.
JMF is a Java Standard Extension for the 
JavaBeans Activation Framework. The 
DataSource interface fully encapsulates the
underlying data in an object and provides 
a common interface to access it. The meth- 
ods are very simple to use. The getCon- 
tentType method returns the MIME type 
of the data in the object. The DataSource 
interface consists of: 


InputStream getInputStream();
OutputStream getOutputStream();
String getContentType();


I use DBObject’s getInputStream method
in Listing Three to stream in the bitmap
photograph of each customer from the
database. First, I make the dbo variable
refer to a DBObject proxy of the multimedia
object referred to by the customer
object’s photo property by using the getProperty
method. Then, I invoke the
DBObject’s getInputStream method to
open an input stream for the multimedia
data, which is simply streamed out to a disk file.
Any multimedia data can be streamed this way.


Conclusion

Using a Java proxy approach to object
databases is straightforward and understandable.
Both the ODMG language binding
and the Java proxy technology offer a higher
level, more natural way of dealing with
database objects than the JDBC API.

For Java applications that do not have
to fit into an existing infrastructure of applications
written in other languages and
are designed to treat the database as a
passive object repository with all the logic
in the client JVM, the ODMG Java bindings
are a good fit.

Java proxy technology can add significant
value when the applications are based
on a thin client approach with database
objects that support server-side logic as
well as object state. It also makes sense
for applications that plan to make heavy
use of multimedia, or that need to fit into
existing multilanguage solutions.

DDJ


Listing One

// Machine generated
package jp.jasmine.CAStore;

import jp.jasmine.japi.*;
import java.math.BigDecimal;
import java.sql.Date;
import java.rmi.*;
import java.util.*;

public class Customer extends Person {
    protected static DBClass dbc;

    public Customer(Database db, String oid) throws DatabaseException {
        super(db, oid);
    }
    public Customer() throws DatabaseException {
    }
    public static Customer createCustomer(Database db)
            throws DatabaseException {
        Hashtable arg = new Hashtable();
        DBObject dbo = db.createDBObject("CAStore", "Customer", arg);
        return (Customer) toApplicationObject(dbo);
    }
    private static void setDBClass(Database db) throws DatabaseException {
        if (db==null) throw new
            DatabaseException("Null database in setDBClass()");
        if (dbc==null) dbc = db.getDBClass("CAStore", "Customer");
    }
    // -------------------- Properties --------------------
    public int getCustomernumber() throws DatabaseException {
        // A missing proxy is reported as zero, like a NIL value.
        return (dbo==null) ? 0 : dbo.getIntProperty("customernumber");
    }
    public void setCustomernumber(int customernumber)
            throws DatabaseException {
        if (dbo!=null) dbo.setProperty("customernumber", customernumber);
    }
    // -------------------- Methods --------------------
    public static DBCollection getLowCreditRisks( Database db )
            throws DatabaseException {
        if (dbc==null) setDBClass(db);
        Object arg[] = {};
        Object o = null;
        if (dbc!=null)
            o=dbc.execMethod("getLowCreditRisks", arg,
                "List<CAStore::Customer>");
        return (DBCollection) o;
    }
}


Listing Two

import jp.jasmine.japi.*;
import java.io.*;

public class example2
{
    public static void main(String args[]) {
        try {
            Database db;

            // Connect to database as 3-tier application and start a session.
            db=new Database("pcl.cai.com", 1099);
            db.startSession();
            db.startTransaction();

            // Create a database object and proxy it in one call.
            DBObject dbo = db.createDBObject("CAStore", "Customer");

            db.endTransaction();
            db.endSession();
        }
        // Simple exception handling.
        catch( Exception ex) {
            System.out.println("test error: " + ex.toString());
            ex.printStackTrace();
        }
    }
}


Listing Three

import jp.jasmine.japi.*;
import java.io.*;
import java.util.*;

public class example3
{
    public static void main(String args[]) {
        try {
            int aChar;
            Database db;
            DBObject dbo;
            String customerName;

            db=new Database("pcl.cai.com", 1099);   // 1099 is the port number
            db.startSession();
            db.startTransaction();

            // Create a proxy for new temporary database collection of customers.
            DBCollection dblist =
                db.createDBCollection("List", "CAStore::Customer", null);

            // Proxy the customer class (CAStore is like a Jasmine "package").
            DBClass dbclass =
                db.getDBClass("CAStore", "Customer");

            // Call a server-side method to query, call credit bureaus, etc.
            dblist = (DBCollection) dbclass.execMethod("getLowCreditRisks",
                null, "Bag <CAStore::Customer>");

            // Print name of each low credit risk customer & dump photo to disk.
            Enumeration e = dblist.elements();
            for (; e.hasMoreElements(); ) {
                dbo = (DBObject) e.nextElement();
                customerName = (String) dbo.getProperty("name");
                dbo = (DBObject) dbo.getProperty("photo");
                if (dbo == null)
                    System.out.println("Name: " + customerName +
                        " image not available!");
                else {
                    System.out.println("Name: " + customerName + " File: "
                        + customerName + ".bmp");

                    // Stream the multimedia object that we proxy to disk.
                    InputStream is = dbo.getInputStream();
                    OutputStream fos = new BufferedOutputStream(new
                        FileOutputStream(customerName + ".bmp"));
                    while((aChar = is.read()) >= 0)
                        fos.write(aChar);
                    fos.close();
                    is.close();
                }
            }
            db.endTransaction();
            db.endSession();
        }
        // Simple exception handling.
        catch( Exception ex) {
            System.out.println("test error: " + ex.toString());
            ex.printStackTrace();
        }
    }
}

Dr. Dobb’s Journal, May 1999 


DDJ 







VBScript and SQL 


Calendars 





Building a web calendar 





John Donovan Lambert 


At DrMag.com (an Internet-based resource
center for magazines that
lets you search for and subscribe 
to nearly 2000 publications), we 
have reduced printing costs and increased 
communication by developing our in- 
tranet as a replacement for distributing 
information by paper. More information 
gets distributed this way, and it becomes 
available to every employee instantly, 
which is especially helpful in an en- 
trepreneurial firm like ours. Our intranet 
primarily uses VBScripts in Active Serv- 
er Pages (ASP), with data stored in Mi- 
crosoft SQL 7 databases— the same 
technology we use with the public Dr- 
Mag.com site. 

We put off creating intranet calendar- 
based reports for day-to-day analysis be- 
cause creating such a web page was a lit- 
tle less than intuitive. Once you have the 
structure for one such report, however, prac- 
tically anything can be output with only 
simple modifications to the query. There 
are numerous issues that I'll discuss here 
for putting SQL results in a web calendar, 
but it boils down to three main tasks: 





• Structuring a SQL query so that the results
can easily be assigned to a calendar’s
cells.

• Writing scripts to dynamically create a
calendar.

• Inserting the SQL results into the calendar.

John is the CIO of DrMag.com. He can be
reached at lambert@drmag.com.





In addition to presenting the exact VBScripts
I use, I’ll discuss enough of the
logic to make it straightforward for you to 
port it to Java, Perl, Cold Fusion, or what- 
ever language you prefer. 


The Query 

First, the query will need to report the 
month, the day of the month, and the 
day of the week, for each piece of information
to be displayed in the calendar.
If you want multiple years on one
web page, you have to return the year 
as well. At least one table in each query 
must have a date field, and in most cases,
it should not contain nulls. These columns
are calculated with the DATEPART
function in T-SQL (Microsoft and
Sybase), and with TO_CHAR and Format
in Oracle. DB2 uses three separate
commands. Listing One covers some of
the syntax for these functions, but for 
full syntax or other flavors of SQL, check 
your documentation. (By the way, IBM 
provides free access to over 400 DB2 
books online, including its SQL refer- 
ence, at http://www.software.ibm.com/ 
data/db2/udb/library.html. Thanks, IBM.) 

Listing Two is an example of a useful 
query and Example 1 shows a few rows 
of its results. By the way, if you’re work- 
ing with both SQL and web-page script- 
ing, get used to seeing overlapping terms 
for database objects: Columns and fields 
mean the same thing, and so do rows and 
records. 

The UnitsSold table has a date column 
(RecDate) and information such as keys 
to other tables to identify which item was 
purchased, who the sales person was, 
who purchased it, and so on, and it has 
one record for each unit sold. This prod- 
uct group has many units sold each day, 
so there will be many records with the 
same day in the date column. In the se- 
lect list, three of the four columns returned 
deal with the date, and only one is the 
data we actually want to present in our

calendar. This is provided by the aggre- 
gate COUNT function on the OrderID col- 
umn. Since there is one record for each 
sale, a count of the records for each day 
returns the total number of units sold for 
each day (if any). 

The @Year is a T-SQL variable in my 
stored procedure, named sp_UnitsSold. If 
you're using a SQL database, as opposed 
to something like Microsoft Access, be sure 
to put your query in a stored procedure 
and have the web-page script pass the 
year as an input variable. (See Listing 
Three for a VBScript call to a stored pro- 
cedure with a passed variable.) The per- 
formance boost is well worth it. Other- 
wise, you will need to replace the SQL 
variable with a VBScript variable; see Ex- 
ample 2 for an instance of that type. 

If you have records with more than one 
year in the date column, you must include 
the WHERE clause to restrict the results 
set to show records from only one year, 
or add the year to the select list, group 
by, and order by statements to show 
records from multiple years. 

In the Example 1 results, the first record 
shown is for Friday, since that is the sixth 
day of the week. This may vary, depend- 
ing on your SQL server’s configuration. 
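
For readers porting the logic rather than the SQL, the shape of the result set can be sketched in Java. This is an illustration under assumptions: UnitsPerDay and its methods are hypothetical names, a list of sale dates stands in for the UnitsSold table, and the weekday numbering assumes Sunday is day 1, which matches the remark that Friday is the sixth day.

```java
import java.time.LocalDate;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory equivalent of "count the records per day for one year".
class UnitsPerDay {
    // One entry per day that had sales, in date order, like GROUP BY plus ORDER BY.
    static Map<LocalDate, Integer> countByDay(List<LocalDate> saleDates, int year) {
        Map<LocalDate, Integer> counts = new TreeMap<>();
        for (LocalDate date : saleDates) {
            if (date.getYear() == year) {       // plays the role of the WHERE clause
                counts.merge(date, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Weekday numbered 1 (Sunday) through 7 (Saturday); an assumed convention,
    // since the database's setting may differ.
    static int sundayFirstWeekday(LocalDate date) {
        return date.getDayOfWeek().getValue() % 7 + 1;
    }

    public static void main(String[] args) {
        List<LocalDate> sales = List.of(
            LocalDate.of(1999, 1, 1), LocalDate.of(1999, 1, 1), LocalDate.of(1999, 1, 4));
        countByDay(sales, 1999).forEach((date, units) ->
            System.out.println(date.getMonthValue() + " " + date.getDayOfMonth() + " "
                + sundayFirstWeekday(date) + " " + units));
    }
}
```

Days with no sales simply have no entry, so the calendar code has to treat a missing day as an empty cell.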


The Calendar 

STDCAL.ASP (available electronically; see
“Resource Center,” page 5) is code for a
web page that displays a standard calendar.
You can use this as a reference for
the HTML/VBScript that creates the calendar
without being cluttered with presenting
query results. This HTML table creates
eight columns: a column on the left
to hold the name of the month, and one
for each day of the week. A live example
of this page is available at
http://www.drmag.com/calendar.asp.

Example 1: Sample results.

Example 2: Troubleshooting by outputting variables to HTML. (a) Original,
(b) change to.

The HTML form posts back to itself so 
that the person viewing it can change the 
year displayed. Scripting languages have 
their own valid ranges of years, and your 
SQL may have a different range from your 
scripting language. The reason this code
checks both request.querystring and
request.form is that request.querystring
checks for the parameters in the URL, and
request.form looks for them in the HTTP
header, where they will be if the POST
method of an HTML form has been used.
Incidentally, if you’d like an easy way to
see the entire HTTP header, you can print
it to a web page using
<%=request.servervariables("ALL_RAW")%>.

The basic logic is to determine the year 
and month the code will work with next, 
determine the day of the week that 
month’s first day falls on, determine the 
number of weeks in that month, and use 
variables to keep up with the values as 
they change. You also have to calculate 
the length of the month by setting a date 
variable to the first day of the next month 
and subtracting one day. (An IF/ELSE
statement handles December differently.
And yes, you use the DateAdd function
to subtract a day.) Present the name of the
month if it’s the first line for the month,
and if the first day of the month is not on
Sunday, fill in blank cells up to the first
day, with a SELECT CASE. Then fill in the
days of the month (1-31) for the first week
ending on Saturday. Start the next row and 
fill in the days of the month until your 
day counter equals the last day of the 
month. If the last day is not Saturday, fill 
in blank cells for the rest of the week. Re- 
peat for the next month while any months 
remain. 
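The month-layout steps above can be sketched compactly. This is an illustrative Python version rather than the article's VBScript; the function names are hypothetical, and None stands in for an empty table cell.

```python
from datetime import date, timedelta

def last_day_of_month(year, month):
    """First day of the following month minus one day.
    December is handled separately because there is no month 13."""
    if month == 12:
        first_of_next = date(year + 1, 1, 1)
    else:
        first_of_next = date(year, month + 1, 1)
    return (first_of_next - timedelta(days=1)).day

def month_rows(year, month):
    """Return the month as rows of 7 cells, Sunday first; None is a blank cell."""
    # date.weekday() is Monday=0..Sunday=6; shift so Sunday=0..Saturday=6.
    lead = (date(year, month, 1).weekday() + 1) % 7
    cells = [None] * lead + list(range(1, last_day_of_month(year, month) + 1))
    # Pad the final week with blanks up to Saturday.
    cells += [None] * (-len(cells) % 7)
    return [cells[i:i + 7] for i in range(0, len(cells), 7)]
```

For example, month_rows(1999, 2) starts with one blank cell because February 1, 1999 fell on a Monday.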

The main HTML table has one fixed row of column headers, and then we
have our first VBScript loop, in this case a FOR/NEXT, since it's going
to present exactly 12 months. Use a DO LOOP if you want a conditional
number of months presented. A month counter variable (mc) identifies
the month about to be processed.

To track the number of weeks in a particular month, set the variable
NumWeeks to 5 for the default, and use an IF statement to change it to
6 if the beginning weekday of the month (bwdom) is 6 and the last day
of the month (ldom) is 31, or if the beginning weekday of the month is
7 and the last day of the month is equal to or greater than 30. An
additional IF checks to see if the month is February, not a leap year,
and the first day of the month is Sunday, the only case in which
NumWeeks is set to 4.
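The week-count rule can be checked mechanically. The sketch below restates it in Python and compares it with the standard library's Sunday-first month grid; num_weeks and check_rule are names invented for this illustration, while bwdom and ldom follow the article's variables.

```python
import calendar

def num_weeks(bwdom, ldom):
    """bwdom: weekday of the 1st (1=Sunday .. 7=Saturday);
    ldom: last day of the month (28..31)."""
    weeks = 5
    if (bwdom == 6 and ldom == 31) or (bwdom == 7 and ldom >= 30):
        weeks = 6
    if bwdom == 1 and ldom == 28:
        # Non-leap February starting on Sunday fits in exactly four rows.
        weeks = 4
    return weeks

def check_rule(first_year, last_year):
    """Return True if the rule matches calendar's grid for every month in range."""
    cal = calendar.Calendar(firstweekday=calendar.SUNDAY)
    for year in range(first_year, last_year + 1):
        for month in range(1, 13):
            grid = cal.monthdayscalendar(year, month)
            bwdom = grid[0].index(1) + 1
            ldom = calendar.monthrange(year, month)[1]
            if num_weeks(bwdom, ldom) != len(grid):
                return False
    return True
```

The rule works because a month needs six rows only when the leading blanks plus the month's days exceed 35 cells.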


The Final Form 

Listing Three takes the HTML/VBScript of 
STDCAL.ASP, and with a few modifica- 
tions, inserts an ODBC connection, query, 
and presentation of the query results. 
Once you have the query returning results 
correctly, and the HTML table presenting 
the calendar correctly, it’s fairly easy to 
look at Listing Four and see where to plug 
in the query and its results. 

Unlike STDCAL.ASP, this code uses a 
DO UNTIL to make the presentation of 
months conditional. If no records are re- 
turned for some months, those months 
won't be displayed. 

The biggest difference is the addition of the DIM D(31) statement,
which creates a variable array. Unless declared otherwise, VBScript
variables are the variant datatype. This array gets its values assigned
with an initialization loop inside the month loop. Then, as you loop
through the HTML table to create a cell and put the day of the month in
it, the day of the month variable corresponds to the array variable.
For example, the third day of the month will have its SQL results
stored in D(3). When the code creates a day's cell, you just print the
value inside it with <%=D(DayCounter)%>.
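The day array works as a lookup table indexed by day of the month. Here is a small Python sketch of the same idea; the function name and the (day, units) tuples are assumptions standing in for the recordset rows.

```python
def day_cells(rows, blank="&nbsp;"):
    """Build a per-day lookup for one month.
    rows: iterable of (day_of_month, units_sold) pairs for that month."""
    # 32 slots so that index 3 means day 3; slot 0 is unused,
    # much as VBScript's D(31) spans indexes 0 through 31.
    cells = [blank] * 32
    for day_of_month, units in rows:
        cells[day_of_month] = str(units)
    return cells
```

Days with no matching record keep the non-breaking space, so every table cell still renders.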

If you need to troubleshoot, you can break long scripts and output
variable contents as HTML, as in Example 2. This technique is
especially useful when you're combining text strings with variables, to
make sure your concatenation and other syntax is coming out correctly.
Of course, run your query with your normal query tool first, to make
sure it outputs the records correctly.


Single-Month View Variation 

A seemingly infinite number of variations can be made for the HTML
table layout and the time period covered by it, like two months
side-by-side, or your company's current fiscal quarter. If you have a
large amount of data to display for each day, such as text as opposed
to amounts, the most useful variation is a single-month or single-week
web page that has much larger cells for each day (for an example, see
http://www.drmag.com/month.asp).

Listing Four is a stored procedure you 
could use with a single month, so you 
can compare it with Listing Two’s query 
for an entire calendar year. Example 3 
shows sample results. MONTH.ASP 
(available electronically) is HTML code 
for a single month view, with a query 
and its results inserted. The query in this 
example should be a stored procedure 
if your database provides that function- 
ality, but I’ve done it this way to show 
users of Access and other small DBs an 
example of making a direct query, with- 
out a stored procedure. (By the way, if 
you are using Access in particular, don’t 
expect it to handle more than a few si- 
multaneous connections without slow- 
ing to a crawl. Consider upgrading to 
Microsoft SQL7, which can handle mul- 
tiple simultaneous connections much 
faster than Access can handle one.) 

The main differences from Listing 
Three are tracking the month variable 
the same way the year variable is 
tracked, and modification of the query 
to return data for only a single month. 
Also, in the calendar’s cell where the day 
of the month and query results are dis- 
played, I prefer to use a nested HTML 
table to maximize layout control, while 
retaining the broadest possible browser 
compatibility. 

To improve the usability, at the bottom of the month display, I use
hyperlinks to page to the previous and next months. The hyperlink
passes the month and year variables in the URL instead of the body of
the HTTP request. If the month displayed is February through November,
just increment or decrement the month. For January, the Previous link
has to decrement the year and set the month to December, and for
December's Next link, the year has to be incremented and the month set
to January. An IF/ELSEIF/ELSE statement handles this. Below that is an
HTML form with a SELECT for the month and an input text box for the
year, so users can jump to any month and year.
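The year rollover for the Previous and Next links is easy to get wrong at the edges, so here is a brief Python sketch of it. The helper names are hypothetical; the wYear and wMonth parameter names follow the listings, and the page name is whatever page hosts the calendar.

```python
def previous_month(year, month):
    """January steps back to December of the previous year."""
    return (year - 1, 12) if month == 1 else (year, month - 1)

def next_month(year, month):
    """December steps forward to January of the next year."""
    return (year + 1, 1) if month == 12 else (year, month + 1)

def nav_link(page, year, month):
    """Build a link that carries the target month and year in the URL."""
    return f"{page}?wYear={year}&wMonth={month}"
```

For instance, nav_link("month.asp", *previous_month(2000, 1)) yields "month.asp?wYear=1999&wMonth=12".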

Last, you'll notice that the HTML SE- 
LECT for the month is over 60 lines of 
code because of the IF/ELSE statements 
used for each month. This is only nec- 
essary if you want the month displayed 
to be the default selection. You can short- 
en your code if you skip choosing the 
default month in this select, but it won't 
speed it up much. Other variations in- 
clude: 


• You can make a clock web page for reports segmented into hours,
  minutes, or even seconds. Just check your SQL documentation for the
  syntax of the DATEPART functions.

• You can vary the query, and therefore the results, by login ID for
  pages that are password protected. VBScript provides access to the
  HTTP header information through ServerVariables, and the headers
  include user IDs when a password scheme is used. For example,
  <%IF INSTR(Request.ServerVariables("LOGON_USER"), "drdobb") THEN ... %>
  is one way to start the logic. Use an INSTR function if the web
  server is on Windows NT, because the LOGON_USER variable may include
  the domain name, depending on whether the user logs into the web page
  from inside or outside the domain LAN. With VB and VBScript, INSTR
  doesn't require an operator such as "> 0" because if there is no hit,
  it returns zero, which equates to False, and any hit at all returns
  an integer of one or more, which equates to True.

• You can add a column on the right to display totals for the week by
  several methods. Constructing a single query to return daily and
  weekly totals required a fairly complex query (and I thank the
  friends who helped), and gave the slowest overall performance. I
  haven't found a SQL dialect yet with a week-of-month function to make
  such a query easier. A second method is to run a separate query for
  the weekly totals, but the best performance in my tests came from
  simply using a VBScript variable to add up each day's value, display
  it after Saturday was counted, and reset it to zero before counting
  the next week.


Presenting Anniversaries and Holidays

Think twice before adding holidays to a web-based calendar if you
intend for it to have international appeal. Political holidays vary by
nation, and religious holidays are never appreciated by everyone. If
you want to present holidays on a limited-culture intranet, however,
here are some examples that will help you figure out how to present any
holiday.

The easiest feature to present is an anniversary or holiday that always
occurs on the same day of the same month, such as New Year's Day (in
the Gregorian calendar), which always falls on January 1. In the loop
where the number of the day is presented, simply use an IF statement to
check whether the month being presented is equal to 1 and the day
counter is equal to 1. If so, add your holiday text label or graphic
before the END IF, like <%IF wMonth = 1 AND DayCounter = 1
THEN%>&nbsp;<FONT size=2>New Year's Day</FONT><%END IF%>.





Slightly more complex is a holiday such as Thanksgiving Day, which is
always the fourth Thursday in November (in America). Because the month
can begin on any one of seven days of the week, it is possible for the
fourth Thursday to be in the fourth or fifth week (the fifth when the
month begins on Friday or Saturday). In the IF statement, add AND
DayOfWeekCounter = 5 to include only Thursdays, and change the
DayCounter test to check a range, with DayCounter > 21 AND DayCounter
< 29. If you're looking for the first through fourth occurrence of a
particular day of the week (nday), such as the fourth Thursday, follow
these guidelines:

• First nday of the month, change the IF statement to DayCounter < 8.

• Second nday of the month, change to DayCounter > 7 AND DayCounter < 15.

• Third nday of the month, change to DayCounter > 14 AND DayCounter < 22.

• Fourth nday of the month, change to DayCounter > 21 AND DayCounter < 29.


The rule is a little different if you want the last nday of the month,
such as Memorial Day being the last Monday in May (in America). It
isn't the same as the fourth nday of the month, because months vary in
length. Here's the rule for this condition for every month except
February, depending on how many days the month has:

• 31 days, use DayCounter > 24.
• 30 days, use DayCounter > 23.

If February, because of periodic leap years, use the last-day-of-month
variable (ldom) as in Example 4.
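The day-range rules above reduce to two comparisons: the nth occurrence of a weekday falls on days 7(n-1)+1 through 7n, and the last occurrence falls after day ldom - 7. The Python sketch below illustrates both; the function names are invented for the example, and weekdays are numbered as in the article (1 = Sunday through 7 = Saturday).

```python
import calendar
from datetime import date

def dow(year, month, day):
    """Day of week numbered 1=Sunday .. 7=Saturday."""
    return date(year, month, day).isoweekday() % 7 + 1

def nth_weekday(year, month, weekday, n):
    """Day of the month of the n-th given weekday (n = 1..4)."""
    ldom = calendar.monthrange(year, month)[1]
    for day in range(1, ldom + 1):
        # Same test as "DayCounter > 7(n-1) AND DayCounter < 7n + 1".
        if dow(year, month, day) == weekday and 7 * (n - 1) < day < 7 * n + 1:
            return day
    return None

def last_weekday(year, month, weekday):
    """Day of the month of the last given weekday."""
    ldom = calendar.monthrange(year, month)[1]
    for day in range(1, ldom + 1):
        # ldom - 7 gives 24 for 31-day months and 23 for 30-day months.
        if dow(year, month, day) == weekday and day > ldom - 7:
            return day
    return None
```

Using ldom - 7 for the last occurrence covers February in leap and non-leap years without a special case.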

Holidays based on lunar schedules are the most complex, and you'll have
to use an algorithm to relate the lunar schedule to the Gregorian
calendar. For example, here is how I handle Easter (Christian), which
is the first Sunday following the "Paschal Full Moon" and can occur in
March or April (based on Carter's algorithm; see
http://www.ast.cam.ac.uk/pubinfo/leaflets/easter/easter.html). I have
checked its accuracy against posted dates for Easter for a number of
years past, and it always agreed. Right after initializing DayCounter
and WeekCounter, before the loop that creates the cells for each week,
I initialize two new variables, eMonth and eDay, to identify the month
and day of the month for Easter in a given year; see Listing Five.
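For readers who want to check the arithmetic away from ASP, here is a Python sketch of the same calculation. The variable names v1 through v7 follow Listing Five. The per-century constants are the standard ones for this method over the years 1700 to 2199; treat them as my restatement and not a quotation from the listing.

```python
def easter(year):
    """Return (month, day) of Easter for 1700 <= year <= 2199."""
    if 1700 <= year <= 1799:
        v6, v7 = 23, 3
    elif 1800 <= year <= 1899:
        v6, v7 = 23, 4
    elif 1900 <= year <= 2099:
        v6, v7 = 24, 5
    elif 2100 <= year <= 2199:
        v6, v7 = 24, 6
    else:
        raise ValueError("year outside 1700-2199")
    v1, v2, v3 = year % 19, year % 4, year % 7
    v4 = (19 * v1 + v6) % 30
    v5 = (2 * v2 + 4 * v3 + 6 * v4 + v7) % 7
    month, day = 3, 22 + v4 + v5
    if day > 31:
        month, day = 4, v4 + v5 - 9
        # Two exceptional cases pull a late April date back one week.
        if day == 26:
            day = 19
        elif day == 25 and v4 == 28 and v1 > 10:
            day = 18
    return month, day
```

The two April exceptions are the cases where the unadjusted result would land a week too late.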


Then, inside the loop where all the other anniversary and holiday IF
statements appear, I add <%IF wYear > 1699 AND wYear < 2200 AND
abs(wMonth) = eMonth AND DayCounter = eDay THEN %>Easter<%END IF%>. I
used the absolute function (abs) with wMonth because the variant
datatype wouldn't match with eMonth otherwise, even though DayCounter
will match eDay without it. A bug, or a feature? At any rate, you can
see lunar-based special days can be difficult to implement in the
Gregorian calendar. If you have to do much of this, show this sample to
your boss and ask for a raise! Otherwise, have fun making web
calendars, and remember that next year is always better.


DDJ 





Listing One

Return an integer representing a specific part of a date: (a) Transact SQL
(Microsoft and Sybase) DATEPART(datepart, date); (b) Oracle TO_CHAR(date,
'datepart') FORMAT; (c) DB2.

(a)
Parameter   Result Type    Digits Returned
yyyy        year           1753-9999
mm          month          1-12
dd          day of month   1-31
dw          day of week    1-7 (Sun.-Sat.)

(b)
Parameter   Result Type    Digits Returned
YYYY        year           1-9999
MM          month          01-12
DD          day of month   01-31
D           day of week    01-07

(c)
YEAR(date) for year as 1-9999
MONTH(date) for month as 1-12
DAY(date) for day of month as 1-31
DAYOFWEEK(date) for day of week as 1-7


Listing Two

CREATE PROCEDURE sp_UnitsSold @Year int AS
SELECT DATEPART(mm, RecDate) AS Month, DATEPART(dd, RecDate) AS DayOfMonth,
    DATEPART(dw, RecDate) AS WkDay, COUNT(OrderID) AS UnitsSold
FROM UnitsSold
WHERE DATEPART(yyyy, RecDate) = @Year
GROUP BY DATEPART(mm, RecDate), DATEPART(dd, RecDate), DATEPART(dw, RecDate)
ORDER BY DATEPART(mm, RecDate), DATEPART(dd, RecDate)


Listing Three

<%@ LANGUAGE=VBScript %>
<%wYear=Request.QueryString("wYear")
IF wYear = "" THEN wYear=Request.Form("wYear")
IF wYear = "" OR wYear < 1753 OR wYear > 9999 THEN wYear = DATEPART("yyyy",now()) %>
<html><HEAD><TITLE>Calendar Report of Units Sold</TITLE></head>
<BODY> <CENTER>
<form action="calendarunitsales.asp" method="post">
Enter a year for report:<br>
<input size=6 maxlength=4 name=wYear><br>
<input type="submit" name="Change" VALUE="Change">
</form><BR>
<H2><%=wYear%> Unit Sales</H2><P>
<%
dim D(31)
Set conn=server.createobject("ADODB.connection")
conn.Open "DATABASE=[database name];DSN=[DSN Name];UID=[login];Password=[password];"
%>
<TABLE ALIGN=center WIDTH=30% BORDER=1 CELLSPACING=1 CELLPADDING=2>
<TR>
<TD ALIGN=middle></TD>
<TD ALIGN=middle><STRONG>Sun.</STRONG></TD>
<TD ALIGN=middle><STRONG>Mon.</STRONG></TD>
<TD ALIGN=middle><STRONG>Tue.</STRONG></TD>
<TD ALIGN=middle><STRONG>Wed.</STRONG></TD>
<TD ALIGN=middle><STRONG>Thu.</STRONG></TD>
<TD ALIGN=middle><STRONG>Fri.</STRONG></TD>
<TD ALIGN=middle><STRONG>Sat.</STRONG></TD>
</TR>
<%
SET rs1=conn.Execute("EXECUTE sp_UnitsSold " & wYear)
DO UNTIL rs1.EOF 'CONDITIONAL MONTH LOOP
mc = rs1("Month")%>
<TR>
<TD ALIGN=middle><STRONG><%=MonthName(mc)%></STRONG></TD>
<% 'Determine day of the week the month begins on
tempdate = mc & "/1/" & wYear
bwdom = datepart("w", tempdate)
DayOfWeekCounter = 0 'This "week" has 8 "days" to
                     ' include the Name of the Month column.
'PRINT LEADING BLANK DAYS
SELECT CASE bwdom
CASE 1
DayOfWeekCounter = 1
CASE 2 %>
<TD></TD><%DayOfWeekCounter = 2
CASE 3 %>
<TD></TD><TD></TD><%DayOfWeekCounter = 3
CASE 4 %>
<TD></TD><TD></TD><TD></TD><%DayOfWeekCounter = 4
CASE 5 %>
<TD></TD><TD></TD><TD></TD><TD></TD><%DayOfWeekCounter = 5
CASE 6 %>
<TD></TD><TD></TD><TD></TD><TD></TD><TD></TD><%DayOfWeekCounter = 6
CASE 7 %>
<TD></TD><TD></TD><TD></TD><TD></TD><TD></TD><TD></TD><%DayOfWeekCounter = 7
CASE ELSE %>
<TD>Beginning Day of Week Error</TD><%
END SELECT
'Determine last day of month & number of weeks
IF mc = 12 THEN 'December: there is no month 13 to step back from
ldom = 31
ELSE 'First day of next month, minus one day
ldom = Day(DateAdd("d", -1, mc + 1 & "/1/" & wYear))
END IF
NumWeeks = 5
IF (bwdom = 6 AND ldom = 31) OR (bwdom = 7 AND ldom > 29) THEN NumWeeks = 6
'INITIALIZE DAY ARRAY
lc = 0
FOR lc = 1 to 31
D(lc) = "&nbsp;"
NEXT
'loop through records for the month & assign to D array.
DO WHILE rs1("Month") = mc 'DAY ASSIGNMENT LOOP
TempD = rs1("DayOfMonth")
D(TempD) = rs1("UnitsSold")
IF NOT rs1.eof THEN rs1.movenext
IF rs1.eof then exit do
LOOP
DayCounter = 1
WeekCounter = 1
DO WHILE WeekCounter < NumWeeks + 1
DO WHILE DayOfWeekCounter < 8
IF DayCounter < ldom + 1 THEN %>
<TD ALIGN=middle><SUP><FONT size=-2><%=DayCounter%>
</FONT></SUP>&nbsp;<FONT color="ff0000">
<STRONG><%=D(DayCounter)%></STRONG></FONT></TD>
<%ELSE%>
<TD ALIGN=middle></TD>
<%END IF
DayOfWeekCounter = DayOfWeekCounter + 1
DayCounter = DayCounter + 1
LOOP
DayOfWeekCounter = 1
WeekCounter = WeekCounter + 1 %>
</TR><TR><TD></TD>
<%LOOP
LOOP
rs1.Close
conn.Close%>
</TABLE> <P>
</CENTER></BODY></HTML>


Listing Four

CREATE PROCEDURE sp_SalesRecords @Year int, @Month int AS
SELECT DATEPART(dd, RecDate) AS DayOfMonth, DATEPART(dw, RecDate) AS WkDay,
    DaysTopCustomer
FROM SalesRecords
WHERE DATEPART(yyyy, RecDate) = @Year
AND DATEPART(mm, RecDate) = @Month
GROUP BY DATEPART(dd, RecDate), DATEPART(dw, RecDate)
ORDER BY DATEPART(dd, RecDate)


Listing Five

<% IF wMonth = 3 OR wMonth = 4 THEN 'Begin Easter calculation
eMonth = 0: eDay = 0: v1 = 0: v2 = 0: v3 = 0: v4 = 0: v5 = 0: v6 = 0: v7 = 0
IF wYear > 1699 AND wYear < 1800 THEN v6 = 23: v7 = 3
IF wYear > 1799 AND wYear < 1900 THEN v6 = 23: v7 = 4
IF wYear > 1899 AND wYear < 2100 THEN v6 = 24: v7 = 5
IF wYear > 2099 AND wYear < 2200 THEN v6 = 24: v7 = 6
v1 = wYear MOD 19: v2 = wYear MOD 4: v3 = wYear MOD 7
v4 = ((19*v1)+v6) MOD 30
v5 = ((2*v2)+(4*v3)+(6*v4)+v7) MOD 7
eDay = (22+v4+v5): eMonth = 3
IF eDay > 31 THEN
eDay = (v4+v5-9): eMonth = 4
IF eDay > 24 THEN
IF eDay = 26 THEN eDay = 19
IF eDay = 25 AND v4 = 28 AND v1 > 10 THEN eDay = 18
END IF
END IF
END IF
'End Easter calculation %>


DDJ 
The CVS Data Format

Efficient storage and retrieval of geographical data

Cesar A. Gonzalez Perez

As a computer specialist working with archaeologists, I've found many
areas of activity that suffer from lack of appropriate tools and
methods. One of the most notorious areas involves the use of
cartographic information to locate and set in context archaeological
sites and other geographical places. Paper maps are often the only
means of dealing with geographical locations, apart from lists of
coordinates, which seldom solve any problem. Of course, commercial
Geographical Information System (GIS) packages exist, but none combine
power with ease of use. They tend to have too many features and are
less than intuitive for those lacking computer training, not to mention
they are usually expensive or a pain in the neck to use.

Consequently, I and other members of the Landscape Archaeology Research
Unit at the University of Santiago de Compostela decided to invest in
research on simple cartographic representations for geographic location
and reference. As a result, we designed and implemented a new data
format and a small set of accompanying tools.

Cesar is with the University of Santiago de Compostela in Spain. He can
be contacted at phcgon@usc.es.

The biggest problem when dealing with cartographic information is the
huge amount of data needed to acceptably manage and display a
medium-sized area. Our archaeological work is strongly based on the
zoom principle, which says that any study must be done at several
scales centered around the same area to be precise and in context.
Also, our area of work covers the whole of Galicia, over 30,000 square
kilometers. In addition, we specialize in archaeological impact
assessment, which often involves working in linear-track works such as
motorways or pipelines, involving very long and narrow work areas
instead of the classical circular ones.

We envisioned a system capable of displaying a layered contour map,
relieving users from intrusive tasks such as changing sheets after
hitting a sheet border or changing scales. Also, a major problem in
some GIS tools is the huge number of files they create. Since we
believe that the users shouldn't have to worry about thousands of files
and relationships among them, we decided that the system should
integrate all the information about a specific wide area in a single
file, including different levels of detail. We did not attempt to
perform automatic geographic generalization, but instead to store
already-computed data about different levels of detail into one file.
The system would then select the most appropriate data set from context
information, such as the working scale, output destination, and user
preferences.

Furthermore, the huge amount of in- 
formation required to deal with cartogra- 
phy results in the need to index the data 
inside the files so retrieval is fast enough. 
A 2D indexing scheme was needed, be- 
cause cartographic data is almost always 
retrieved following an inside-rectangle test. 
Our own experiments showed that raw lists of coordinates with no
indexing did well in small areas (up to 50 km²), but performed badly
above this limit.


The CVS Data Format 
The result of our work is the CVS (which, 
in English, stands for the “Segmented Vec- 
torial Cartography”) data format, which 
stores homogeneous cartographic data for 
a specific geographic area into a single 
file, optionally including different levels 
of detail and offering a two-dimensional 
indexing scheme. 

A CVS file holds what in classical terms could be called a layer, or
information relative to a single thematic coverage. The layer concept
has been extensively used in the GIS world and is beyond the scope of
this discussion. To obtain a complete map, several layers are usually
necessary, so several CVS files are needed.

The information inside a CVS file is partitioned into levels, each
corresponding to a level of detail at which the geographic information
of the area can be represented. In fact, all the levels in a CVS file
represent the same area, but at different levels of detail. Thus, each
level is more suitable to be displayed for a specific range of working
scales. Levels are the means by which the zoom principle can be
successfully applied.

Also, the information inside a CVS file is partitioned into sectors,
each corresponding to a rectangle on the area to be represented.
Division into sectors is made separately for each level, so low-detail
levels can be partitioned into few sectors (even just one), and
high-detail levels can be divided into up to 65,536 sectors in the
current implementation. Sectors are the way to achieve two-dimensional
indexing.

Every sector in a CVS file contains curves, 
as the CVS format is initially oriented to deal 
with contour maps. Each curve is stored as 
a sequence of points. A curve spanning two 
or more sectors is accordingly split in as 
many curve segments as needed, all of them 
with the same curve identifier. We plan to 
improve the CVS specification with capa- 
bilities to store different kinds of informa- 
tion other than curves. 
Format Description

As Figure 1 illustrates, each CVS file con- 
tains a file header, data header, one or 
more levels, and one or more sectors for 
each level. The file header contains a mag- 
ic number identifying the file as a CVS file, 
information about format revision (cur- 
rently Version 2), and room for future ex- 
tensions such as the content type. Cur- 
rently, no content type is specified as only 
one is implemented. The data header 
holds the minimum and maximum values 
for x-, y-, and z-coordinates in the whole 
file, the number of levels in the file, and 
some data for each of them. In turn, each 
level contains a level header and data for 
each sector inside it. The level header car- 
ries the sector count for this level, and 
some information for each of them. Each 
sector holds a sector header and the car- 
tographic information itself in the form of 
curves. Curves are not indexed, and con- 
sist of a curve identifier and a sequence 
of coordinate triplets. 

The file pointers from the LevelInfo and SectorInfo data elements in
Figure 1 point to LevelData and SectorData, respectively, and
constitute the foundation of the indexing mechanism. The ScaleFrom and
ScaleTo fields for each LevelInfo are stored in meters per pixel (MPP),
a good way to express scale on digital media. The higher these values,
the lower the level of detail. On a 17-inch monitor with a resolution
of 1024x768 pixels, 100 MPP corresponds to a 1:320,000 conventional
scale. Also, 64-bit floating-point numbers are used to store
coordinates, allowing the CVS data format to deal with Universal
Transverse Mercator (UTM) coordinates, our system of choice as the
whole of Galicia is contained in a single UTM zone.
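The relation between MPP and a conventional map scale is a single division: the scale denominator is the ground distance per pixel divided by the physical size of a pixel. The Python sketch below works through the article's 1:320,000 figure. The visible screen width of 0.32 m is my assumption for a 17-inch monitor, not a value given in the text, and the function name is hypothetical.

```python
def conventional_scale(mpp, screen_width_m, horizontal_pixels):
    """Return the denominator N of the 1:N scale for a given meters-per-pixel value."""
    # Physical width of one pixel on the screen, in meters.
    pixel_pitch_m = screen_width_m / horizontal_pixels
    return mpp / pixel_pitch_m

# 100 MPP on a 1024-pixel-wide display assumed to be 0.32 m across.
scale = conventional_scale(100, 0.32, 1024)
```

With these assumed dimensions the result is about 320,000, in line with the figure quoted above; a different visible width shifts the result proportionally.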

Finally, the CVS data format performs 
a little trick to improve data retrieval per- 
formance. Each sector stores all the ver- 
tices of the curves it includes, plus two 
more optional vertices for each curve, one 
before the first vertex inside the sector (in 
case the curve starts outside the sector), 
and the other after the last vertex (in case 
the curve ends outside the sector). This 
offers the whole path of a curve segment 
for each sector. See Figure 2 for details on 
these off-by-one vertices. 
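The off-by-one idea can be illustrated on a plain list of points. The following Python sketch is not the CVS implementation: it uses 2D tuples in place of the format's coordinate triplets, the function names are hypothetical, and it ignores curves that cross a sector without leaving any vertex inside it.

```python
def inside(pt, rect):
    """True if point (x, y) lies within rect = (x0, y0, x1, y1), borders included."""
    x, y = pt
    x0, y0, x1, y1 = rect
    return x0 <= x <= x1 and y0 <= y <= y1

def sector_segments(curve, rect):
    """Split a polyline into runs of vertices inside rect, each padded with
    the neighboring outside vertex at either end when one exists."""
    segments = []
    i, n = 0, len(curve)
    while i < n:
        if not inside(curve[i], rect):
            i += 1
            continue
        start = i
        while i < n and inside(curve[i], rect):
            i += 1
        lo = max(start - 1, 0)  # vertex just before the curve enters, if any
        hi = min(i + 1, n)      # slice end that includes the vertex just after it leaves
        segments.append(curve[lo:hi])
    return segments
```

Storing the extra vertices lets a reader draw the curve up to the sector border using that sector's data alone.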


How It Works 

Assume that a CVS file is stored on disk, and that some piece of
software wants to read it to display a map. After checking the magic
number to reduce the risk of file type conflicts, and verifying that
the CVS revision of the file is compatible with its own, the software
checks that the map it intends to display intersects the area specified
by the data header fields FromX, FromY, FromZ, ToX, ToY, and ToZ. If
not, no useful data is contained in the CVS file. If this test is
successful, the software scans through every LevelInfo element to find
the one with an appropriate range of scales, looking at the ScaleFrom
and ScaleTo fields. Once found, the software can follow the LevelInfo's
Pointer into a LevelData, which will contain a header with the sector
count and some information for each sector. Scanning through SectorInfo
elements, the software builds a list of which sectors are to be
retrieved to draw the map, by computing whether or not each sector
area, given by the FromX, FromY, ToX, and ToY fields, intersects the
wanted map area. Once this list is built, the software must iterate
over it, navigating to the cartographic information by using each
SectorInfo's Pointer field into a corresponding SectorData. From this
element, the software reads the curve count and starts iterating over
every curve. Curves are not indexed or delimited, so retrieving all the
curves in a sector is, in the current form of CVS, a strictly
sequential process. Each curve starts with a CurveHeader element that
holds a curve identifier and a vertex count, after which follows a
sequence of vertices, each one consisting of x-, y-, and z-coordinates.
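The selection steps can be sketched with in-memory stand-ins for the file structures. This Python illustration does not parse a real CVS file, since the byte layout lives in Figure 1. The class fields mirror the field names in the text, the Python lists stand in for the file pointers, and the inclusive scale test is an assumption.

```python
from dataclasses import dataclass, field

@dataclass
class SectorInfo:
    from_x: float
    from_y: float
    to_x: float
    to_y: float
    curves: list = field(default_factory=list)   # stands in for the Pointer to SectorData

@dataclass
class LevelInfo:
    scale_from: float                            # meters per pixel
    scale_to: float
    sectors: list = field(default_factory=list)  # stands in for the Pointer to LevelData

def rects_intersect(a, b):
    """Inside-rectangle test on (x0, y0, x1, y1) tuples; touching edges count."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    return ax0 <= bx1 and bx0 <= ax1 and ay0 <= by1 and by0 <= ay1

def pick_level(levels, mpp):
    """Return the first level whose scale range contains the working scale."""
    for level in levels:
        if level.scale_from <= mpp <= level.scale_to:
            return level
    return None

def sectors_to_read(level, view):
    """List the sectors of a level that intersect the wanted map area."""
    return [s for s in level.sectors
            if rects_intersect((s.from_x, s.from_y, s.to_x, s.to_y), view)]
```

Only the sectors returned by sectors_to_read need their curves scanned, which is where the two-dimensional index saves work.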


Tools
We’ve developed a number of tools to 
work with CVS files. 


• Format converter, which converts Drawing eXchange Format (DXF) files
into CVS files. DXF files are a common way to interchange vectorial
drawings, and AutoCAD (our main digital input tool) can output them
easily. We already had a DXF parser and converter developed in-house,
so the finite state machine implementation to extract information from
DXF files existed already. We decided to let our DXF parser convert DXF
files into an intermediate format called "DAT," and build a DAT-to-CVS
converter.

The DAT2CVS converter works by first 
specifying a DAT input file and a CVS 
output file (which will be created), and 
then by mapping one or more layers in 
the DAT file — retained from the origi- 
nal DXF file—to each of the desired 
levels in the output CVS file. A DAT lay- 
er can be input to none, one, or sever- 
al levels, and each level can merge data 
from one or more DAT layers. 

After specifying how many levels are desired, and setting the mapping
options between layers and levels, a quad tree depth must be chosen for
each level. The whole area spanned by the DAT file is then recursively
divided into quarters up to the selected depth. The current
implementation of the DAT2CVS converter allows the user to specify a
lower limit of 0 (resulting in just one sector, or no division) and an
upper limit of 8 (resulting in 65,536 sectors), although the CVS format
is not limited in this way.
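Each extra level of quad tree depth quarters every sector, so a depth of d yields 4^d sectors laid out on a 2^d by 2^d grid. The Python sketch below illustrates the arithmetic; the function names and the row-by-row ordering are assumptions made for the example, not the converter's actual behavior.

```python
def sector_count(depth):
    """Number of sectors produced by quartering the area `depth` times."""
    return 4 ** depth

def sector_rects(area, depth):
    """Return the sector rectangles (x0, y0, x1, y1) for a regular quad tree
    of the given depth over area = (x0, y0, x1, y1), row by row."""
    x0, y0, x1, y1 = area
    side = 2 ** depth
    w = (x1 - x0) / side
    h = (y1 - y0) / side
    return [(x0 + i * w, y0 + j * h, x0 + (i + 1) * w, y0 + (j + 1) * h)
            for j in range(side) for i in range(side)]
```

A depth of 0 gives one sector covering the whole area, and a depth of 8 gives the 65,536 sectors mentioned above.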

Some options can also be changed, 
such as the directory for temporary files 
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and the Ratio Source to Destination 
(RSD), which is used to decrease the 
amount of disk space used during the 
conversion. (Reducing the RSD can low- 
er required disk space during conver- 
sion, but the more it is reduced, the 

(continued on page 50) 
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Figure 1: The CVS data format. Magic equals OXABCDEFS88, and CVSRevision 
equals 2. Definitions on the left side are developed on the right side. Data 
elements with labels that end with a colon are defined later. Numbers preceded 
by a plus sign above the data elements show the byte offset of each data element 
from the start of the definition. Numbers below data elements show the length in 
bytes of the data element (4 means a 32-bit integer and 8 means a 64-bit 


floating-point number). 





Figure 2: Curve segment inside a sector. In (a), the curve is represented by a 
thick dark line. Curve vertices are marked as small circles. The sector boundaries 
are drawn as rectangles. Vertices inside the sector are marked dark, while 
vertices outside the sector are marked blank. The curve path for the sector is 
drawn with a thin gray line underimposed to the curve line. Notice that two 
vertices outside the sector are needed in order to fully specify the curve path. 
These two vertices are called “off-by-one vertices.” With off-by-one vertices, the 
curve segment can be drawn as in (b). Without them, the curve could only be 


drawn as in (c). 
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(continued from page 53) 
greater the chances of unrecoverable er- 
rors. In the case of such errors, the 
DAT2CVS converter sends a message, 


recommends adjusting the RSD, and 
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Figure 3: CVSView. The file name 
and size are displayed at the top. The 
CVS revision number follows. Data 
header information is then displayed, 
including the byte offset of each field 
in the file (in both hexadecimal and 
decimal notations) and the field value 














quits. Conversion must then be started 
again.) Huge amounts of disk space are 
usually needed during conversion. As a 
rule of thumb, an RSD of 70 percent 
usually works without a glitch when 
converting conventional maps with an 
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even coverage of contours and four or 
more sectors. In case of error, raise the 
RSD up to the safest 100 percent. 

The format converter (available electronically; see “Resource Center,”
page 5) is written in Visual Basic and includes a complete setup
package (including DLLs and other components).

• Access library, which encapsulates the particularities of the CVS
data format. In its current form, it is an ActiveX DLL that exports
five classes: Layer, Levels, Level, Sectors, and Sector. It has been
used with Visual Basic 5 programs with great success. The code in
Listing One uses the CVS access library to open a CVS file, get its
first level, and dump all the curves and vertices of all sectors in
that level. To use this code, you need the CVS access library and
Microsoft Visual Basic 5.

• Viewing and dumping tools, which (as their names suggest) view and
dump the contents of CVS files. CVSView dumps the contents of any CVS
file to the desired depth. Figure 3 shows a CVSView dump. CVSEdit is
more sophisticated, as it shows the internal structure of a CVS file
in the form of a tree, and allows editing some data fields such as
scale ranges for each level or coordinate values. CVSEdit uses the CVS
access library described above. Figure 4 shows this tool.

• Visualizing tools, which test the performance and usability of
cartographic user interfaces based on the CVS data format. They use
the CVS access library to read and display several layers of
information accounting for 20.5 MB. The tools allow panning and
zooming, automatically computing which level is best to use and which
sectors to display. Figure 5 shows CVSTest.

itself. Notice the first level pointer at offset 0x54, pointing at
0x9C, and the corresponding level data at the bottom starting at that
offset.

Figure 4: CVSEdit. The hierarchy of levels, sectors, curves, and
vertices can be seen. Properties with a blue icon are user editable.
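The idea of choosing which sectors to display while panning can be pictured with a small sketch. The tools themselves are written in Visual Basic against the CVS access library, so the Java below is only an illustration under stated assumptions: the `SectorBox` class, its bounding-box fields, the `sampleGrid` data, and `visibleSectors` are hypothetical names, and the article does not say that the library selects sectors this way.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SectorSelection {
    // Hypothetical sector: an id plus an axis-aligned bounding box.
    static final class SectorBox {
        final int id;
        final double minX, minY, maxX, maxY;

        SectorBox(int id, double minX, double minY, double maxX, double maxY) {
            this.id = id;
            this.minX = minX;
            this.minY = minY;
            this.maxX = maxX;
            this.maxY = maxY;
        }

        // True when this box and the viewport rectangle overlap.
        boolean intersects(double vMinX, double vMinY, double vMaxX, double vMaxY) {
            return minX <= vMaxX && maxX >= vMinX
                && minY <= vMaxY && maxY >= vMinY;
        }
    }

    // A 2 x 2 grid of sectors, each 10 units on a side (made-up data).
    static List<SectorBox> sampleGrid() {
        return Arrays.asList(
            new SectorBox(1, 0, 0, 10, 10),
            new SectorBox(2, 10, 0, 20, 10),
            new SectorBox(3, 0, 10, 10, 20),
            new SectorBox(4, 10, 10, 20, 20));
    }

    // Returns the ids of the sectors that overlap the viewport.
    static List<Integer> visibleSectors(List<SectorBox> sectors,
            double vMinX, double vMinY, double vMaxX, double vMaxY) {
        List<Integer> ids = new ArrayList<Integer>();
        for (SectorBox s : sectors) {
            if (s.intersects(vMinX, vMinY, vMaxX, vMaxY)) {
                ids.add(s.id);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // A viewport inside the first sector selects only that sector.
        System.out.println(visibleSectors(sampleGrid(), 2, 2, 8, 8));
        // A viewport over the grid center touches all four sectors.
        System.out.println(visibleSectors(sampleGrid(), 5, 5, 15, 15));
    }
}
```

Only the curves of the selected sectors would then need to be read and drawn, which is the property that makes sector-based formats attractive for panning.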


Current Use and Future Enhancements 

The CVS data format is currently being 
used to provide cartographic facilities to 
our main information system, used by 25 
simultaneous users several hours a day. 
The CVS files being used cover the whole of
Galicia, and integrate the full 1:100,000
cartography of this area. CVS files live on
our applications server, and each client 
reads them through a local copy of the 
CVS access library. The first improvement 
we have made to the described set of tools 
is to port the CVS access library from Vi- 
sual Basic 5 to Visual C++ 5, achieving 
some performance improvements. (We have not performed measured tests,
but our experience indicates that the slight improvements are mainly
due to the disk-access mechanisms used by C++ libraries in comparison
to those of Visual Basic. Thanks to my colleague Roberto Gomez, who
ported the CVS access library into Visual C++.) Currently, we are
planning to redesign it as a server-side component so only the
selected sectors travel through the network, and the sequential
portion of the work (iterating over all curves in each sector)
benefits from being executed on the server. We have experimented with
DCOM and found it suitable for a design like this.

Figure 5: CVSTest. Contours and rivers can be seen on the map,
corresponding to two different CVS files. Sectors for the current
level of the contours file are displayed in blue. An information
window (floating on the map) shows data about levels and sectors
being used.

We are also planning to convert the 
CVSTest tool into an ActiveX control, in- 
cluding a canvas and enough function- 
ality to draw and manage multilayer 
maps, so any application written in any 
ActiveX-hosting language could use it. 
Also, the DAT2DXF converter must be improved in both performance and
disk-space requirements. Finally, extending the CVS data format to
host content kinds other than curves is easy, and we intend to do so.
We consider digital elevation models and archaeological site
distributions as candidate content kinds.

We routinely overlay other data that we 
use (such as archaeological sites) on top 
of CVS layers, pulling it from Microsoft SQL 
Server 6.5, Microsoft Access 96, and CA Jas- 
mine databases, depending on the system. 
Our main internal working system, the SIA+ Archaeological Information
System (an integrated information system for the management of
archaeological sites and finds, assessments, projects, people,
documents, and images; see http://wwwetarpa.usc.es/), pulls data from
a 45-MB Access database to show geographic locations, zones, and
sites atop the CVS layers.

The CVS data format is an inexpensive and easy-to-use solution for
applications that need to display and operate on contour maps. We
know that many improvements are still necessary to make the CVS data
format a professional solution. Any help or collaboration will be
welcome.
Listing One

'open a CVS file.
Dim ly As New Layer
ly.OpenFile "C:\Temp\Test.cvs"

'get the first level.
Dim lv As Level
Set lv = ly.Levels(1)

'iterate all sectors.
Dim lSectorIdx As Long
Dim sc As Sector
For lSectorIdx = 1 To lv.Sectors.Count
    'get sector.
    Set sc = lv.Sectors(lSectorIdx)
    'output data.
    Debug.Print "Sector " & CStr(lSectorIdx) & ":"

    'begin retrieving curve data for this sector.
    Dim lCurveCount As Long
    sc.BeginGetData lCurveCount

    'iterate all curves in this sector.
    Dim lCurveIdx As Long, lId As Long
    For lCurveIdx = 1 To lCurveCount
        'get curve info.
        Dim lVertexCount As Long
        sc.GetCurveInfo lId, lVertexCount
        'output data.
        Debug.Print "  Curve " & CStr(lId) & " with " _
            & CStr(lVertexCount) & " vertices:"
        'iterate vertices for this curve.
        Dim lVertexIdx As Long
        For lVertexIdx = 1 To lVertexCount
            'get vertex data.
            Dim dX As Double, dY As Double, dZ As Double
            Dim bInside As Boolean
            sc.GetVertex dX, dY, dZ, bInside
            'output data.
            Debug.Print "    Vertex (" & CStr(dX) & ", " & CStr(dY) & ", " _
                & CStr(dZ) & ") " & CStr(bInside)
        Next lVertexIdx
    Next lCurveIdx

    'end retrieving curve data.
    sc.EndGetData
Next lSectorIdx

'close CVS file.
ly.CloseFile
Agent Itineraries

An alternative data structure for agent systems

Russell P. Lentini, Goutham P. Rao, and Jon N. Thies


Itineraries are used in many agent architectures, including the
Concordia, IBM Aglets, Voyager, Odyssey, and EMAA agent
architectures. These architectures treat an itinerary as an
enumeration or list of tasks to be performed by an agent. In this
article, we'll take a different approach, treating an itinerary as a
metaprogram: a way of programming an agent and inadvertently its
goal. In the process, we'll show how easy it is to build agent
architectures using Java and a few straightforward concepts. Our
ultimate goal is to illustrate how you can enhance agent
architectures by concentrating on a critical data structure, the
itinerary. A flexible, generic itinerary data structure allows for
greater expressive power when defining application-specific agents.

To illustrate itineraries, we'll present an itinerary that performs a
straightforward database query. However, we’ve implemented
itineraries on several real-world applications, including a
distributed fault-tolerant information discovery system. In that
system, agents performed abstract queries using a syntax similar to
what you would use on the Web through, say, Lycos or Alta Vista. The
agents were able to overcome problems such as low-bandwidth,
frequently disconnecting networks, or even database machines being
temporarily unavailable. Another application in which we've used
itineraries is a distributed data fusion/threat-assessment engine,
where the agent moves among computers monitoring real-time signals
from sensors. The system then uses the data-fusion logic to sense
nearby threats.

At the time this article was written, the authors were members of the
engineering staff at Lockheed Martin Advanced Technology
Laboratories. They can be contacted at rlentini@atl.lmco.com,
grao@gradient.cis.upenn.edu, and jthies@gradient.cis.upenn.edu,
respectively.

Our approach to agent itinerary borrows ideas from a fundamental
computing model, the Finite State Machine (FSM). Theoretically, FSMs
attempt to capture the general nature of computation of certain
computational procedures. Conceptually, FSMs have a finite set of
states, alphabets, and instructions. Physically, they can be
visualized as a model of computation that has storage capable of
remembering its current state. They also have the capability to read
input and compute both a next state function and an output function.
The next state function determines the next state the machine should
be in based on the current state, the input, and a simple set of
instructions that map input signal and state pairs to a new state.
The output function determines the output alphabet of a machine for a
state (see Figure 1).

Formally, an FSM is a sextuple like that in Figure 2(a), where K is a
finite set of states that the machine can be in, Σ is a finite set of
input symbols, O is a finite set of output symbols, and s is the
initial state. The Next State function, δ, in Figure 2(b), is the
program to the FSM and is called the “transition” function. It
specifies, for each combination of current state s ∈ K and current
symbol i ∈ Σ, a new state s′ ∈ K; see Figure 2(c).

Finally, δ_O: K → O, the output function, specifies the output symbol
for each state. A configuration of an FSM is an instantaneous
description of the machine in its computation path. A configuration
specifies the current state, the input symbol being scanned by the
machine, and the output symbol.
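To make the definition concrete, here is a minimal Java sketch of such a machine, with a transition function stored as a table and an output function that depends only on the state. It is our illustration rather than part of the article's listings, and every name in it (`MooreMachineDemo`, the `St` and `In` enums, the turnstile example) is an assumption chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;

class MooreMachineDemo {
    enum St { LOCKED, UNLOCKED }   // K: the finite set of states
    enum In { COIN, PUSH }         // the finite set of input symbols

    // Transition function: (current state, input symbol) -> next state.
    static final Map<St, Map<In, St>> DELTA = new HashMap<St, Map<In, St>>();
    // Output function: state -> output symbol.
    static final Map<St, String> OUTPUT = new HashMap<St, String>();

    static {
        Map<In, St> fromLocked = new HashMap<In, St>();
        fromLocked.put(In.COIN, St.UNLOCKED);
        fromLocked.put(In.PUSH, St.LOCKED);
        Map<In, St> fromUnlocked = new HashMap<In, St>();
        fromUnlocked.put(In.COIN, St.UNLOCKED);
        fromUnlocked.put(In.PUSH, St.LOCKED);
        DELTA.put(St.LOCKED, fromLocked);
        DELTA.put(St.UNLOCKED, fromUnlocked);
        OUTPUT.put(St.LOCKED, "locked");
        OUTPUT.put(St.UNLOCKED, "open");
    }

    // Feeds the inputs to the machine, starting from the initial state,
    // and returns the output symbol of the state it ends in.
    static String run(St initial, In... inputs) {
        St current = initial;
        for (In symbol : inputs) {
            current = DELTA.get(current).get(symbol);
        }
        return OUTPUT.get(current);
    }

    public static void main(String[] args) {
        System.out.println(run(St.LOCKED, In.COIN));
        System.out.println(run(St.LOCKED, In.COIN, In.PUSH));
    }
}
```

The whole behavior of the machine lives in the two tables, which is the property the FS itinerary borrows: changing the program means changing the transition data, not the loop that walks it.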

So what does this mean? FSMs can be used to simulate certain
computational procedures. They are good for describing the general
flow of an algorithm responding to input. The implicitly generic
nature of the transition function provides much of this power. At the
same time, we view an agent’s itinerary as a metaprogram, or as a way
of programming an agent. It is basically a set of commands for
controlling an agent by specifying the tasks that are to be performed
under certain conditions. We start by borrowing some concepts of an
FSM with the hope that the metalanguage is flexible and has the
expressive power to describe complicated itineraries.

Formally, we define a Finite State (FS) 
itinerary as the quadruple in Figure 2(d), 
where K is a set of states that the agent can 
be in. The state of an agent describes the 
task or program that the agent executes. Σ
is a set of conditions, where a condition is
basically a trigger upon which the agent 
acts, much like an FSM’s computation path 
depending on the input alphabet. Simply 
put, the result of the evaluation of a con- 
dition occurring in real time is what causes 
an agent to enter a state, executing the pro- 
gram represented by that state. 

We define c as the initial configuration 
of the FS itinerary, where a configuration 
is a state-condition pair; see Figure 2(e). In 
other words, the configuration is the current state of the agent and
the most recent condition that has been evaluated. Configurations
arising during execution of the itinerary are generated in real time.
This lets the agent react to real-time situations.

Finally, δ, the transition function, is the
metaprogram for the agent’s itinerary. It 
describes the transition from the current 
configuration to the next configuration. 
The transition function is basically a way 
of programming an agent to execute cer- 
tain programs (specified by a state) under 
certain conditions. Any subsequent config- 
uration is obtained by examining the cur- 
rent configuration’s state and by evaluating 
its corresponding condition at run time. This 
is the fundamental difference between the 
FS itinerary approach and that taken by 
existing agent systems that incorporate the 
concept of an itinerary. 

As Figure 3 shows, the FS itinerary is basically a data structure
capable of storing a transition graph of various configurations that
describe to the agent how to move from one configuration to another.
The data structure is similar to an FSM.

Figure 1: The output function determines the output alphabet of the
machine for a state.

Any particular instance of an agent will 
define to its FS itinerary the metaprogram 
(transition function graph) that will navi- 
gate the agent through various configura- 
tions. The initial state of the initial con- 
figuration is spurious and is not the result 
of any condition. The FS itinerary will start 
with the initial configuration and evaluate 
the initial condition specification. Evalu- 
ating a condition specification will depend 
on real-time conditions that exist around 
the agent (analogous to input symbols to 
the FSM) and the result of this evaluation 
is an indication of the current conditions. 
Once the current conditions have been 
evaluated, the FS itinerary invokes the tran- 
sition function to get the next configura- 
tion to move into. The transition function 
determines the next configuration based 
upon the current conditions and the state 
of the agent. This new configuration in- 
dicates the new state for the agent, and 
the FS itinerary will execute the program 
associated with this state. The FS itinerary 
again evaluates the condition specifica- 
tion of this new configuration and in this 
manner, continues to execute the metapro- 
gram of the agent, until a final (halting) 
configuration has been reached. 


Java and a Layered Agent Architecture

As Figure 4 illustrates, the agent system 
we present here is implemented as a lay- 
ered architecture. At its core lies the FS 
itinerary, with functionality added in lay- 
ers. The agent layer provides an interface 
around the system to the agent applica- 
tions. A good design for the agent layer 
would shield you from the intricacies of 
mobility and state transitions. 

Listings One through Five present the 
Java package finitestateitinerary, which de- 
fines five components that comprise the 
FS itinerary block. 

The State component (Listing One) generically represents the program
to be executed by the itinerary on reaching this state. The component
is implemented as a Java abstract class. Its functionality is defined
in the run() method and is application dependent. Furthermore, it
declares two static states, INIT and HALT, that are standard to any
agent metaprogram.

Figure 3: The FS itinerary is basically a data structure capable of
storing a set of instructions.

Figure 4: An agent system implemented as a layered architecture.

The Transition Function interface (Listing Two) represents the
transition policy in choosing the next configuration for the FS
itinerary. Your implementation of this interface will evaluate
run-time conditions to select a new configuration. As we will see in
our example, such conditions may depend on previous states. As far as
the FS itinerary is concerned, all that is required of this component
is the ability to return the next configuration.

Figure 5: Flow chart of the agent created using the finite state
itinerary.

Listing Three is the Configuration component, a placeholder for the
State component and the transition function at that node in the FS
itinerary. This component extends the State class and implements the
Transition Function interface. The class FSIConfiguration (Listing
Eight) implements the Configuration component. FSIConfiguration
stores a reference to the state that is to be executed when the FS
itinerary enters this configuration. It defines an addTransition()
method that maps a condition (Listing Nine) to a new configuration.
This is a key function in building the finite-state-transition graph
that defines the metaprogram. In its implementation of
nextConfiguration(), this class returns the configuration associated
with the first condition that was satisfied, giving a very simple
transition policy. The run() implementation of this class calls the
run() method on the state object that it contains. We will see in our
example that there will be certain configuration transitions that do
not depend on any condition evaluation. The condition component
defines a static DEFAULT_CONDITION for this purpose. It always
returns true, so the transition it is attached to is always taken
when no earlier condition has been satisfied.

In Listing Six, we define a mobility condition. This is the first
building block toward agent mobility and a good example of a
condition object. The purpose of this condition is to evaluate
whether there is a need for the agent to migrate to a destination
machine. The run() implementation of the MigrateState class (Listing
Seven) migrates the FS itinerary to the desired machine. The combined
effect of the mobile condition and state is that when the mobility
condition is added to a configuration, it causes a transition to a
migrate state and all subsequent configurations will execute at the
specified machine. The MigrateState actually migrates the entire FS
itinerary to the destination machine. A network communication with a
machine requires a daemon process to be running at the receiving
host. For this, we use a communication link object, which defines a
startService() method that creates a server socket monitored for
incoming connection requests. (By the way, the communication link
service startService() must have already been started on the machines
that the agent will visit.) The actual socket code has been omitted
for the sake of brevity. The method’s implementation would receive an
object using java.io.ObjectInputStream.readObject(), which is the FS
itinerary class from some other machine. The code would then proceed
to execute the FS itinerary from its current configuration (Java
object serialization maintains the state of an object). The migrate()
method of the migrate state would contain code to make a client
connection to another machine that had executed the startService()
method, and transmit the FS itinerary using
java.io.ObjectOutputStream.writeObject().

Figure 6: Finite state automaton corresponding to the metaprogram in
Listing Ten.
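Because the socket code is omitted, the following sketch shows only the serialization round trip that such a transfer depends on. It is a minimal illustration under stated assumptions: `MiniItinerary` is a hypothetical stand-in for the FS itinerary, `send` and `receive` are made-up helper names, and an in-memory byte array takes the place of the socket streams. The `ObjectOutputStream` and `ObjectInputStream` calls are the standard java.io ones.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class MigrationSketch {
    // Hypothetical stand-in for the FS itinerary: it only remembers
    // how far it has progressed.
    static class MiniItinerary implements Serializable {
        private static final long serialVersionUID = 1L;
        int currentStep = 0;
        void advance() { currentStep++; }
    }

    // Sending side: serialize the object. With a real connection the
    // ObjectOutputStream would wrap the socket's output stream instead.
    static byte[] send(MiniItinerary itinerary) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(itinerary);
            out.close();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Receiving side: rebuild the object, including its saved position.
    static MiniItinerary receive(byte[] data) {
        try {
            ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(data));
            MiniItinerary copy = (MiniItinerary) in.readObject();
            in.close();
            return copy;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        MiniItinerary local = new MiniItinerary();
        local.advance();
        local.advance();
        MiniItinerary remote = receive(send(local));
        // The copy resumes from the step the original had reached.
        System.out.println(remote.currentStep);
    }
}
```

The received object is a separate copy with the same field values, which is why the sending side has to stop executing its own instance after the transfer.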

Listing Four is the Finite State Itinerary 
component, the actual data structure that 
executes the itinerary. It requires, during 
construction, a reference to a configura- 
tion component (the root node of the 
finite-state-transition graph). The run() 
follows a simple logic: 


1. Instruct the current configuration’s conditions to be evaluated.

2. Retrieve the next configuration from the nextConfiguration()
method of the current configuration and set the current configuration
to be equal to this new configuration.

3. Execute the program associated with this new configuration. Return
to step one.


At any point during execution of the 
itinerary, a finitestateitinerary.Halt (Listing 
Five) may be thrown, causing the itinerary 
to stop execution. Compare step one with 
the MobileCondition object that we de- 
fined. The evaluate() method will return 
False when invoked on the destination 
machine and True otherwise. When Mo- 
bileCondition.evaluate() returns True, 
nextConfiguration() returns a configura- 
tion that contains a MigrateState, which 
will cause the itinerary to be transmitted 
to the correct machine. There, it will re- 
sume execution from its current configu- 
ration. This ensures that all states are ex- 
ecuted at their correct destination 
machines. The current configuration is au- 
tomatically maintained because we use 
Java Object Serialization in the Commu- 
nicationLink, which again starts the exe- 
cution of the received itinerary. 

We have omitted the listing for the Agent component since it is
fairly straightforward to visualize an agent wrapper around the
finitestateitinerary package. This finitestateitinerary.agent
package, at the least, would contain an agent class that encapsulates
the FS itinerary, mobility conditions, a custom configuration
component, and a transition policy. Such a package would correspond
with the notional concept of the agent paradigm, while providing a
uniform and consistent interface into the underlying packages. The
database application we present is straightforward and does not
require this special agent class. However, building such a wrapper
would be essential to a complete mobile-agent package.







The Database Application 

Listing Ten builds the metaprogram transition graph that implements
the flow chart in Figure 5. With the power of the FS itinerary, we
are able to construct contingency plans at each step along the way.
Tracing the flow chart, you can see that a database query is to be
executed at a machine called ENIAC. However, if the itinerary is at
some other machine, the mobility condition will cause a transition to
a migrate state configuration, which moves the itinerary to ENIAC.
The itinerary then performs the database query and tests the number
of records retrieved. The query state has two conditions that are now
evaluated. If the number of records is less than one hundred, the
itinerary executes some alternative query. Otherwise, the itinerary
halts. This metaprogram is a direct mapping to the FS automaton in
Figure 6. Although this metaprogram exhibits a simple task list with
simple contingencies, arbitrarily large metaprograms could be built
in this fashion to map directly to their own FS automaton.
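The control flow just described can be traced with a compact, single-file sketch. It is not the authors' package: the class names (`ItinerarySketch`, `Config`, `Task`) are ours, `Halt` is made an unchecked exception to keep the example short, migration is left out, and the record count is passed in as a plain number instead of coming from a database query.

```java
import java.util.ArrayList;
import java.util.List;

class ItinerarySketch {
    // Thrown to stop the itinerary (unchecked stand-in for Halt).
    static class Halt extends RuntimeException {
        Halt(String message) { super(message); }
    }

    interface Condition { boolean evaluate(); }
    interface Task { void run(); }

    // A configuration pairs a task with an ordered list of transitions.
    static class Config {
        final String name;
        final Task task;
        final List<Condition> conditions = new ArrayList<Condition>();
        final List<Config> targets = new ArrayList<Config>();

        Config(String name, Task task) { this.name = name; this.task = task; }

        void addTransition(Condition c, Config target) {
            conditions.add(c);
            targets.add(target);
        }

        // First-match policy: the first condition that holds wins.
        Config next() {
            for (int i = 0; i < conditions.size(); i++) {
                if (conditions.get(i).evaluate()) return targets.get(i);
            }
            throw new Halt("no satisfiable condition");
        }
    }

    // Builds the query graph and returns the configurations visited.
    static List<String> trace(final int recordsFromFirstQuery) {
        List<String> visited = new ArrayList<String>();
        Task noop = new Task() { public void run() { } };
        Condition always = new Condition() {
            public boolean evaluate() { return true; }
        };
        Config root = new Config("init", noop);
        Config query1 = new Config("query1", noop);
        Config query2 = new Config("query2", noop);
        Config halt = new Config("halt", new Task() {
            public void run() { throw new Halt("normal termination"); }
        });
        root.addTransition(always, query1);
        query1.addTransition(new Condition() {
            public boolean evaluate() { return recordsFromFirstQuery < 100; }
        }, query2);
        query1.addTransition(always, halt);
        query2.addTransition(always, halt);

        Config current = root;
        try {
            while (true) {
                current = current.next();  // choose the next configuration
                visited.add(current.name);
                current.task.run();        // run its task; halt throws
            }
        } catch (Halt h) {
            // the itinerary has finished
        }
        return visited;
    }

    public static void main(String[] args) {
        System.out.println(trace(42));   // few records: alternative query runs
        System.out.println(trace(500));  // enough records: straight to halt
    }
}
```

With 42 records the trace is query1, query2, halt; with 500 records it is query1, halt. The order in which transitions are added matters, because the default condition would otherwise shadow the record-count test.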


Conclusion 

An enhancement to our implementation 
would be to add the capability of the tran- 
sition function to return more than one 
configuration from the nextConfigura- 
tion() method. This would require a 
change in the run() method of the FS 
itinerary to handle an enumeration of con- 
figurations. This change would represent 
the spawning of multiple FS itinerary eval- 
uations for a given configuration, and 
would allow states to be executed in par- 
allel— a powerful addition to the imple- 
mentation presented. 
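One way to picture this enhancement is sketched below. It is an assumption-laden illustration, not the authors' design: `nextConfigurations()` (plural) is a hypothetical method that returns every configuration whose condition holds, and a work queue stands in for the spawned evaluations, so the branches run one after another here rather than in parallel threads.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class FanOutSketch {
    interface Condition { boolean evaluate(); }

    static class Node {
        final String name;
        final List<Condition> conditions = new ArrayList<Condition>();
        final List<Node> targets = new ArrayList<Node>();

        Node(String name) { this.name = name; }

        void addTransition(Condition c, Node target) {
            conditions.add(c);
            targets.add(target);
        }

        // Every satisfied condition contributes a successor,
        // not just the first one.
        List<Node> nextConfigurations() {
            List<Node> result = new ArrayList<Node>();
            for (int i = 0; i < conditions.size(); i++) {
                if (conditions.get(i).evaluate()) result.add(targets.get(i));
            }
            return result;
        }
    }

    // Visits every enabled branch of an acyclic graph. Each queue entry
    // stands for one spawned itinerary evaluation.
    static List<String> runAll(Node root) {
        List<String> visited = new ArrayList<String>();
        Deque<Node> pending = new ArrayDeque<Node>();
        pending.add(root);
        while (!pending.isEmpty()) {
            Node current = pending.remove();
            visited.add(current.name);
            pending.addAll(current.nextConfigurations());
        }
        return visited;
    }

    // Made-up graph: two enabled branches and one disabled branch.
    static Node demoGraph() {
        Condition yes = new Condition() {
            public boolean evaluate() { return true; }
        };
        Condition no = new Condition() {
            public boolean evaluate() { return false; }
        };
        Node root = new Node("root");
        root.addTransition(yes, new Node("a"));
        root.addTransition(yes, new Node("b"));
        root.addTransition(no, new Node("c"));
        return root;
    }

    public static void main(String[] args) {
        System.out.println(runAll(demoGraph()));
    }
}
```

Handing each queue entry to its own thread, or to its own migrating copy of the itinerary, would turn this sequential walk into the parallel execution the paragraph describes.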

Here, we have used an iterative ap- 
proach in executing the itinerary. How- 
ever, another implementation of this con- 
cept might be programmed to be event 
driven. Using Java event notification, the 
next configuration can be computed au- 
tomatically on the previous configuration 
raising an event. 

We have presented an adaptable design 
for creating agent itineraries. FS Itinerary 
is based on a fundamental model of com- 
putation, which is applied here as a pow- 
erful representation of an agent’s goal solv- 
ing strategy. 

DDJ 





Listing One

// FILE State.java
package finitestateitinerary;
/**
 * This class is an encapsulation of the program that must be run by
 * the FS Itinerary when it is in a particular configuration.
 * The entry point is the run method.
 */
public abstract class State implements java.io.Serializable {
    /* INIT state is the initial state representation of the finite
     * state machine. */
    public static final State INIT = new State() {
        public void run() {
        }
    };

    /* HALT state is the final state representation of the finite
     * state machine */
    public static final State HALT = new State() {
        public void run() throws Halt {
            throw new Halt("HALT : Halted because of normal termination");
        }
    };

    /* Applications must implement this method for functionality */
    public abstract void run() throws Halt;
}

Listing Two

// FILE TransitionFunction.java
package finitestateitinerary;
/**
 * The application must implement the TransitionFunction class to return
 * the next configuration.
 */
public interface TransitionFunction {
    public Configuration nextConfiguration() throws Halt;
}

Listing Three

// FILE Configuration.java
package finitestateitinerary;
/**
 * The configuration class holds the state to be executed and transitions
 * from this configuration to new ones based on real-time conditions. The
 * next configuration is chosen by the implementation of the
 * TransitionFunction
 */
public abstract class Configuration extends State
        implements TransitionFunction {
}

Listing Four

// FILE FiniteStateItinerary.java
package finitestateitinerary;
import java.io.*;
/**
 * The run method of this class actually starts running the meta-program.
 * It begins with the initial configuration, calls its nextConfiguration
 * method. The nextConfiguration method evaluates real time conditions
 * and returns the next configuration. The initial state is a spurious
 * state and is never executed by the itinerary.
 */
public class FiniteStateItinerary implements Serializable {
    private Configuration configuration = null;

    public FiniteStateItinerary(Configuration configuration) {
        this.configuration = configuration;
    }
    public void run() {
        try {
            /* Loop while the current condition specifies that the itinerary
             * should continue evaluation. */
            while (true) {
                /* Get the next configuration and hence the corresponding
                 * state to move into. */
                configuration = configuration.nextConfiguration();
                if (configuration == null) {
                    return;
                }
                /* Run the program associated with the state object. */
                configuration.run();
            }
        } catch (Halt halt) {
        }
        /* At this point, either final condition or an unsatisfiable
         * condition has been encountered, or the machine was halted. */
    }
}


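Listing Five

// FILE Halt.java
package finitestateitinerary;

public class Halt extends Throwable {
    /* Constructors matching the calls made in the other listings */
    public Halt() { super(); }
    public Halt(String message) { super(message); }
}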


Listing Six

// FILE MobileCondition.java
package finitestateagent;
import java.io.*;
import java.net.*;
import finitestateitinerary.*;
/**
 * This implementation of the Condition object provides the mobile nature
 * to agent applications. It causes the Finite State Itinerary to be written
 * to a computer. It relies on the CommunicationLink class to accomplish this
 * via object serialization. If the evaluate() method returns true, the
 * transition function will choose the corresponding configuration as the
 * next configuration. If this configuration contains a migrate state, the
 * FS itinerary will migrate to the destination machine.
 */
public class MobileCondition extends Condition implements Serializable {
    private InetAddress destination = null;
    public MobileCondition(InetAddress destination) {
        this.destination = destination;
    }

    public boolean evaluate() throws Halt {
        try {
            if (InetAddress.getLocalHost().equals(destination)) {
                return false;
            }
            else {
                return true;
            }
        } catch (UnknownHostException e) {
            /* This computer is not suitable for IP communications */
            throw new Halt();
        }
    }
}

Listing Seven

// FILE MigrateState.java
package finitestateagent;
import java.net.*;
import finitestateitinerary.*;
/**
 * This state causes the entire finite state itinerary to be sent to the
 * destination machine via the CommunicationLink, where the finite state
 * itinerary will resume execution. This state then halts the execution of
 * the itinerary on the current machine by halting the current thread, since
 * a copy of the FS itinerary is now running at the destination machine.
 */
public class MigrateState extends State {
    /* Destination host and the itinerary to send there */
    private InetAddress destination = null;
    private FiniteStateItinerary itinerary = null;

    public MigrateState(InetAddress destination, FiniteStateItinerary
            itinerary) {
        this.destination = destination;
        this.itinerary = itinerary;
    }
    /**
     * The run method is implemented to migrate the agent
     * that owns this state, to a new machine
     */
    public void run() throws Halt {
        /* Migrate this finite state itinerary to the destination machine */

        /* Stop the local execution of this instance of finitestateitinerary
         * since a copy is now running on a new machine */
        throw new Halt("HALT : Because itinerary transferred to destination");
    }
}


Listing Eight

// FILE FSIConfiguration.java
package finitestateagent;
import java.net.*;
import java.io.*;
import java.util.*;
import finitestateitinerary.*;
/**
 * A configuration with a default deterministic transition function with
 * mobility support
 */
public class FSIConfiguration extends Configuration
        implements Serializable {
    private State state = null;
    protected Vector transitions = new Vector();
    public FSIConfiguration(State state) {
        this.state = state;
    }

    /**
     * Add a condition and a target configuration branch on the finite
     * state machine
     */
    public void addTransition(Condition condition,
            Configuration configuration) {
        transitions.addElement(new Transition(condition, configuration));
    }

    /**
     * Retrieve the first met configuration to move into
     */
    public Configuration nextConfiguration() throws Halt {
        for (int i = 0; i < transitions.size(); i++) {
            if (((Transition)transitions.elementAt(i)).condition.evaluate()) {
                return ((Transition)transitions.elementAt(i)).configuration;
            }
        }
        throw new Halt("HALT : No satisfiable condition");
    }
    /**
     * Execute the state
     */
    public void run() throws Halt {
        state.run();
    }
    /**
     * Represents one transition mapping
     */
    private class Transition implements Serializable {
        private Condition condition = null;
        private Configuration configuration = null;
        public Transition(Condition condition, Configuration configuration) {
            this.condition = condition;
            this.configuration = configuration;
        }
    }
}

Listing Nine

// FILE Condition.java
package finitestateagent;
import java.io.*;
import finitestateitinerary.*;
public abstract class Condition implements Serializable {
    /* This static class always causes a transition to the associated
     * configuration */
    public static final Condition DEFAULT_CONDITION = new Condition() {
        public boolean evaluate() {
            return true;
        }
    };
    public abstract boolean evaluate() throws Halt;
}


Listing Ten

import finitestateitinerary.*;
import finitestateagent.*;
import java.net.InetAddress;

public class DemoAgent {
    public FiniteStateItinerary createMetaProgram(InetAddress eniac) {
        // Create the root configuration
        FSIConfiguration root = new FSIConfiguration(State.INIT);
        // Create the itinerary
        FiniteStateItinerary fsi = new FiniteStateItinerary(root);
        // Now create all other configurations
        // NOTE : jdbc.JDBCQueryTask would be implemented to simply
        // perform the query passed on some database.
        jdbc.JDBCQueryTask jdbcTask1 = new jdbc.JDBCQueryTask(
            "SELECT Something FROM Somewhere");
        jdbc.JDBCQueryTask jdbcTask2 = new jdbc.JDBCQueryTask(
            "SELECT SomethingElse FROM SomewhereElse");
        FSIConfiguration dbConfig1 = new FSIConfiguration(jdbcTask1);
        FSIConfiguration dbConfig2 = new FSIConfiguration(jdbcTask2);
        FSIConfiguration migrateConfig = new FSIConfiguration(
            new MigrateState(eniac, fsi));
        FSIConfiguration haltConfig = new FSIConfiguration(State.HALT);
        // Link the above configurations using conditions
        // NOTE : In our implementation of
        // TransitionFunction.nextConfiguration, we traverse a Vector to
        // test conditions. This means that the conditions will be
        // tested in the order they are added.
        root.addTransition(new MobileCondition(eniac), migrateConfig);
        root.addTransition(Condition.DEFAULT_CONDITION, dbConfig1);
        migrateConfig.addTransition(Condition.DEFAULT_CONDITION, dbConfig1);
        // NOTE : DBCondition tests for the number of records
        // retrieved by a query. In this example, if the number of records
        // is less than 100, the condition returns true. Otherwise,
        // the 2nd transition will be taken.
        dbConfig1.addTransition(new DBCondition(jdbcTask1, 100), dbConfig2);
        dbConfig1.addTransition(Condition.DEFAULT_CONDITION, haltConfig);
        dbConfig2.addTransition(Condition.DEFAULT_CONDITION, haltConfig);
        // Return the transition graph
        return fsi;
    }
}

DDJ 
Java and Digital Images

Capturing, storing, and retrieving digital images

David H. Martin and Johnny Martin

The ability to capture, store, and retrieve images is an
often-overlooked feature that can benefit many applications. The
recent introduction of low-cost video-capture hardware has created a
significant market for videoconferencing and online collaboration
software. In addition, image capture, storage, and retrieval
capabilities are potentially useful in more mainstream software
applications. Consider, for example, a patient-care application that
stores a patient’s photograph to reduce the chances of
misidentification. Other applications of low-cost video-capture
hardware include inventory control, surveillance, security systems,
or adding marketing appeal to demos of software that lacks highly
visible features. (Demos that take snapshots of people’s faces and
store them continuously can be very effective at demonstrating a Java
application’s database capabilities, for example.)

C++ applications have imaging and video libraries readily available.
On Windows, the standard API for accessing video-capture devices is
Video for Windows. A C++ program written against this API should work
with any Windows-compatible camera. But what if you’re developing in
Java?

Interfacing Java applications to a video-capture device poses a
special challenge because there is currently no easy way to access
the camera from Java. (The Java Media Framework API from Javasoft
does not address video capture in the 1.0 release.)

David and Johnny are cofounders of Object Guild Inc., specializing in
object-oriented consulting, training, and software. They can be
contacted at davidm@objectguild.com and johnnym@objectguild.com,
respectively.

The Java VM presents a barrier between 
applications and C/C++ APIs used to ac- 
cess the video camera. To access these 
APIs from Java, you must not only write 
JNI methods, but must also address im- 
age conversion problems, performance is- 
sues, and thread synchronization: 


• Captured images need to be converted to an image format readable from Java. Images returned by Video for Windows may be in one of several different formats, depending on the resolution, color model used, and bits per pixel.

• Pixels need to be copied into the Java VM’s memory space. If the application needs to capture frame-by-frame video, these memory transfers need to be optimized for speed.

• Many low-level video-capture APIs use callback functions. Handling the callbacks requires synchronizing multiple threads in both Java and C++.


There are three approaches to incor- 
porating video or image capture into a 
Java application, each with different us- 
ability/complexity tradeoffs: 





No integration. Implement the image 
capture feature as an “open file” dialog, 
allowing users to select GIF or JPEG im- 
age files for the Java application to load. 
It is up to users to run third-party image 
capture utilities. 

This approach avoids the problem alto- 
gether. The application gets images from 
a file, which could have come from a sep- 
arate image-capture program connecting 
to a video device, or from any other source. 
All the application needs to do is read a 
GIF or JPEG image file, a trivial task in Java. 

This may be an appropriate solution if the need for image capture is uncommon, but it is cumbersome for users. They must run a separate application to capture and save the image, remember the image file's location, and then locate that file in a Java dialog.

Loose integration. When image cap- 
ture is needed, the Java application exe- 
cutes a separate C/C++ application that 
lets users interactively capture images. The 
application saves an image as a GIF or 
JPEG image file in a predefined location, 
and signals the Java application, which re- 
trieves the file. 

This is really an automated version of 
the “no integration” approach. The appli- 
cation spawns the image capture program 
for users, and automatically retrieves the 
file when users have closed the image- 
capture program. 

Users do not need to manually start a 
separate application, and do not need to 
worry about saving and retrieving the im- 
age file. 

This solution burdens you (the pro- 
grammer) with the need to write a cus- 
tom image-capture program in C/C++. In- 
stallation is more complex, as a separate 
native executable must be installed along 
with the Java application. 

The biggest drawback with this approach is cosmetic. If the Java application relies on custom widgets or Swing components for the user interface, the image-capture application will unavoidably have a different look-and-feel from the rest of the application’s UI, which presents an unprofessional appearance to users.

Full integration. As you might guess, 
this approach involves being able to di- 
rectly access the video device from Java, 
using a combination of Java and native 
C++ methods to connect to the camera, 
capture image frames, and convert them 
to Java’s image format. 

From the usability standpoint, this is the 
best approach. Users need not perform ex- 
tra steps to connect to the camera; when 
they want to take a picture, it is done 
through a normal Java dialog. This is essentially an “all Java” solution, albeit with underlying native code in the back end.
Thus, it provides smoother UI flow for users, 
and more capabilities, allowing video as 
well as still-frame capture. 

The bad news is that this approach is 
the most difficult to implement. You must 
interface with the camera driver in C++, 
implement efficient image format conver- 
sions and memory transfer operations, and 
handle JNI memory-management and 
thread-control issues. 

The good news is that it is possible to encapsulate this solution in a set of classes with a public API. If the API is designed properly, this approach becomes no harder to implement than the no-integration approach. At Object Guild, we've implemented such an API, called “Grabber for Java.”

Description of Grabber

Grabber consists of a set of Java classes 
and a native method DLL that provide ac- 
cess to a video-capture device directly from 
a Java application. It defines an API for con- 
necting and disconnecting from the cam- 
era, adjusting image size, color depth, frame 
rate for video capture, and capturing still 
images. It also includes Swing and AWT 
GUI classes that make it easy to perform 
basic tasks, such as continuous video cap- 
ture, and changing settings through dialogs. 

The central goal in designing the Grab- 
ber API was to make video simple to in- 
corporate into an application, while pro- 
viding the power and extensibility to do 
more complicated tasks. 

The video device is represented abstractly by the class com.objectguild.camera.VideoGrabber, which locates and connects to the video hardware device; performs frame-by-frame image capture, implementing all necessary image format conversions; and can return an image or raw pixel data as a “snapshot.” (The source code for a program that demonstrates this is available electronically; see “Resource Center,” page 5.)

Before capturing images, the program 
must connect to the camera. This involves 
locating the camera driver, initializing the 
underlying Java and C++ classes, and sig- 
naling the camera to start capturing images. 
VideoGrabber reduces these tasks to a sin- 
gle connect() method, which throws an ex- 
ception if the connection attempt fails. 
The disconnect() method invokes the low- 
level API calls to disconnect from the de- 
vice, and frees memory on the C++ side. 

VideoGrabber also defines methods for 
setting and retrieving image dimensions, 
color depth (bits per pixel), and frame rate. 

Full-motion video in Grabber is automatic: you can install a specialized Canvas object as an observer of the VideoGrabber object. When the VideoGrabber is connected with a frame rate greater than 0, it updates the canvas whenever a new frame is captured. To take a snapshot, you call the snapshotImage() method, which returns an instance of java.awt.Image containing the latest captured frame. To store a snapshot in a database, VideoGrabber provides two lower-level snapshot methods: snapshotPixels(), which returns an int[] array containing the raw pixel data for the image, and getColorModel(), which returns the ColorModel associated with the pixels.
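
As an illustration of how the two lower-level methods might feed a database column, the following sketch normalizes raw pixels to packed ARGB and serializes them with their dimensions. It assumes the pixel array and ColorModel have already been obtained from snapshotPixels() and getColorModel(); the SnapshotCodec class, its method names, and the byte layout are hypothetical and not part of Grabber.

```java
import java.awt.image.ColorModel;
import java.nio.ByteBuffer;

// Hypothetical helper, not part of the Grabber API.
final class SnapshotCodec {
    /** Packs width, height, and the pixels (normalized to ARGB) into a byte array. */
    static byte[] encode(int[] pixels, int width, int height, ColorModel cm) {
        if (pixels.length != width * height) {
            throw new IllegalArgumentException("pixel count does not match dimensions");
        }
        ByteBuffer buf = ByteBuffer.allocate(8 + 4 * pixels.length);
        buf.putInt(width).putInt(height);
        for (int p : pixels) {
            // ColorModel.getRGB converts one pixel to the default ARGB layout,
            // so the stored bytes no longer depend on the capture format.
            buf.putInt(cm.getRGB(p));
        }
        return buf.array();
    }

    /** Reverses encode(): returns the ARGB pixels; dims[0] and dims[1] receive the size. */
    static int[] decode(byte[] blob, int[] dims) {
        ByteBuffer buf = ByteBuffer.wrap(blob);
        dims[0] = buf.getInt();
        dims[1] = buf.getInt();
        int[] argb = new int[dims[0] * dims[1]];
        for (int i = 0; i < argb.length; i++) {
            argb[i] = buf.getInt();
        }
        return argb;
    }
}
```

The resulting byte array could be written to a binary column through whatever database layer the application already uses. Normalizing to ARGB before storage means the reader does not need the original ColorModel.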

A common task for applications incor- 
porating live video capture is to open a win- 
dow showing full-motion video camera im- 
ages. To simplify this task, the API includes 
a class called VideoGrabberCanvas. This 
class, in conjunction with VideoGrabber, 
uses the Observer design pattern to allow 
the canvas to update itself automatically, 
whenever a new frame is captured from 
the camera; see Figure 2.

The Observer pattern provides a means
of defining a one-to-many relationship be- 
tween a single observable object and one 
or more observer objects, in which the 
observable object has no specific knowl- 
edge of its observers. The observable ob- 
ject can issue change notifications that are 
interpreted by each observer as it sees fit. 

To implement this pattern, VideoGrabber extends java.util.Observable. It notifies its observers when a new frame is captured, or when the image dimensions or color depth are changed. VideoGrabberCanvas implements the java.util.Observer interface; when notified that the image has changed, it gets the latest image from the VideoGrabber and draws it on the canvas. VideoGrabberCanvas’ constructor takes a VideoGrabber object as a parameter and automatically registers itself as an observer of the VideoGrabber; see Listing One. Thus, once a VideoGrabberCanvas is instantiated and added to an AWT or Swing window, all the image updating and painting is done automatically whenever the VideoGrabber is connected.
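
The notification flow can be sketched with plain java.util classes. FrameSource and FrameView below are hypothetical stand-ins for VideoGrabber and VideoGrabberCanvas, not the Grabber classes themselves. (On Java 9 and later, java.util.Observable and Observer still work but are marked deprecated.)

```java
import java.util.Observable;
import java.util.Observer;

// Stand-in for VideoGrabber: an observable source of frames (hypothetical).
class FrameSource extends Observable {
    private int[] latest = new int[0];

    /** Called when a new frame arrives; notifies every registered observer. */
    void publish(int[] frame) {
        latest = frame;
        setChanged();      // without this, notifyObservers() does nothing
        notifyObservers();
    }

    int[] latestFrame() {
        return latest;
    }
}

// Stand-in for VideoGrabberCanvas: registers itself in its constructor.
class FrameView implements Observer {
    int updates = 0;
    int[] shown = new int[0];

    FrameView(FrameSource source) {
        source.addObserver(this);
    }

    @Override
    public void update(Observable o, Object arg) {
        shown = ((FrameSource) o).latestFrame(); // pull the newest frame
        updates++;                                // a real canvas would repaint here
    }
}
```

The source never refers to FrameView by type, which is the property that lets any number of canvases or other listeners attach to one camera.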

Another common task is to prompt users to take a snapshot. For example, in an application where users are entering identification information for an individual, it may be desirable to allow users to take a snapshot of the individual, to be included in the person’s profile. In this case, a dialog would come up containing a canvas showing real-time video input from the camera. Users would click the OK button to take a snapshot and close the dialog. Grabber provides AWT and Swing versions of a dialog class containing a self-updating canvas. This class has the static method Image TakePicture(Frame, VideoGrabber) that, when called, opens a modal dialog and returns an image or null, depending on whether users took a picture or canceled the operation.

The combination of a simple yet comprehensive API and GUI support classes yields the ability to incorporate video capture with few lines of code. Listing Two is a button listener that causes a modal dialog to pop up, allowing the user to position the camera, and then take a snapshot.


Architecture 

Although Grabber for Java initially supported only Windows-compatible cameras, such as the Connectix Color QuickCam, its clean design allows the addition of support for any hardware/operating system platform with no coding changes for the applications, and minimal coding changes for Grabber itself. To achieve this goal, we isolate platform-specific code using multiple levels of abstraction, on both the Java and C++ sides; see Figure 1.





The first level of abstraction is the public API, defining the class com.objectguild.camera.VideoGrabber. This is what the applications use to connect to the camera.





Figure 1: Multiple levels of abstraction. 










The second level of abstraction is a protected Java class that sits between the VideoGrabber class and native code. This class, com.objectguild.camera.VideoDevice, is a Java-side representation of a video camera. It defines the native method interface, and is responsible for loading the proper native implementation DLL. Different camera devices or operating systems can be specified by subclassing VideoDevice. The set of native methods is surprisingly small. It includes methods for initializing, connecting, and disconnecting; a method for retrieving the contents of the last scanned frame into an int[] array; and a method for retrieving the current color map.





Figure 2: Use of the Observer pattern for automatic frame-by-frame updates of 


The native method implementations are 
simple delegators to a C++ class called 
VideoCam, which is the third level of ab- 
straction. It defines an abstract interface to 
a generic video camera. Subclasses of 
VideoCam work with specific low-level 
video-capture APIs. The current Grabber 
implementation interfaces with the Video 
for Windows API. The Linux version will 
use a different subclass that talks directly 
to the Connectix QuickCam. 

Since VideoCam defines a low-level API, to add support for a different OS or camera, you need only change the VideoCam C++ class, and subclass VideoDevice, overriding the method to load a different DLL.

Even though you see only the top-most interface (the VideoGrabber class), using multiple layers of abstract classes provides a great deal of flexibility in adding support for different devices and operating systems.
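
The subclassing idea on the Java side might look like the sketch below. This is a hypothetical illustration: the article does not show VideoDevice's source, so the class names, the libraryName() method, and the library names are all assumptions.

```java
// Hypothetical sketch; the real VideoDevice class is not shown in the article.
abstract class DeviceSketch {
    /** Name of the native library that implements this device's native methods. */
    protected abstract String libraryName();

    /** Template method: loads whichever library the subclass names. */
    final void loadNativeCode() {
        // Throws UnsatisfiedLinkError if the library is not on the library path.
        System.loadLibrary(libraryName());
    }
}

// One subclass per platform or camera; only the library name differs.
class VfwDeviceSketch extends DeviceSketch {
    @Override
    protected String libraryName() {
        return "grabber_vfw"; // made-up library name
    }
}

class QuickCamLinuxDeviceSketch extends DeviceSketch {
    @Override
    protected String libraryName() {
        return "grabber_qcam"; // made-up library name
    }
}
```

Because the public class only ever talks to the abstract device, swapping the subclass changes the native back end without touching application code.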


Video for Windows 

We chose to interface Grabber for Java with Video for Windows (VFW) because that is the de facto standard for video-capture devices on Windows systems. Because VideoGrabber talks to Video for Windows instead of a lower-level device driver, the VideoGrabber can connect to any camera built for Windows PCs.

The ability to support many cameras 
with the same code made VFW the obvi- 
ous choice. However, in accessing VFW 
from Java, we ran into some setbacks re- 
sulting from VFW’s tight integration with 
Windows. Among the problems were: 


• The need to create an invisible window, because each VFW function expects a handle to a window as a parameter.

• Event conflicts. The whole program would freeze inside a VFW function call if the function was invoked while a Java button was in the pressed position. Some creative use of threads was required to work around this problem.


Once VFW grabs the frame, it invokes 
the callback function. This function must 
convert the image data from Windows’ 
memory image format to Java’s image 
format, and copy the converted data to 
a Java array. 

Image format conversion is complicat- 
ed by several idiosyncrasies of the Win- 
dows image format. In Windows’ 24-bit 
image format, each pixel is represented by 
3 bytes representing the blue, green, and 
red color components (BGR). In Java, the 
byte order for each pixel is Red-Green- 
Blue (RGB). Also, the Windows bitmap 
format stores the image upside-down. The 
first horizontal line in the bitmap corre- 
sponds to the last horizontal line in the 
displayed image. Java expects the bitmap 
to store the image right-side up. Thus, in 
copying the image data from the C array 
to the Java array, the line order must be 
reversed, and the order of the bytes in each 
pixel must be reversed as well. 
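
The two reversals can be sketched as follows. Grabber performs this conversion in its C++ callback; the Java version here is only an illustration, and the DibConverter class and method name are hypothetical. The sketch also accounts for the padding of each Windows bitmap row to a multiple of 4 bytes.

```java
// Hypothetical illustration of the conversion; not Grabber's actual code.
final class DibConverter {
    /** Converts a bottom-up 24-bit BGR bitmap into top-down packed ARGB ints. */
    static int[] bgrBottomUpToArgb(byte[] dib, int width, int height) {
        int stride = ((width * 3 + 3) / 4) * 4; // bytes per stored row, padded to 4
        int[] argb = new int[width * height];
        for (int y = 0; y < height; y++) {
            // Output row y (counted from the top) is stored row height-1-y.
            int srcRow = (height - 1 - y) * stride;
            for (int x = 0; x < width; x++) {
                int i = srcRow + x * 3;
                int b = dib[i] & 0xFF;
                int g = dib[i + 1] & 0xFF;
                int r = dib[i + 2] & 0xFF;
                // Swap the component order and mark the pixel fully opaque.
                argb[y * width + x] = 0xFF000000 | (r << 16) | (g << 8) | b;
            }
        }
        return argb;
    }
}
```

For a frame-by-frame video path, the same loop would normally write into a preallocated array rather than allocating a new one per frame.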

Unfortunately, 24-bit BGR is only one 
of several possible Windows bitmap for- 
mats. The Windows image may also be in 
4-, 8-, or 16-bit format, where each pixel 
is represented by an integer offset into a 
color palette. In this case, VideoGrabber 
detects the image format used, and cre- 
ates a corresponding color palette in Java. 
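
For the 8-bit case, building the Java-side palette might look like the sketch below. It assumes the Windows palette arrives as 4-byte entries in blue, green, red, reserved order; the PaletteModels class and its method name are hypothetical, and the article does not show how VideoGrabber itself does this.

```java
import java.awt.image.IndexColorModel;

// Hypothetical illustration; not Grabber's actual code.
final class PaletteModels {
    /** Builds an 8-bit IndexColorModel from palette entries stored as B, G, R, reserved. */
    static IndexColorModel fromBgrQuads(byte[] quads) {
        int n = quads.length / 4;
        byte[] r = new byte[n];
        byte[] g = new byte[n];
        byte[] b = new byte[n];
        for (int i = 0; i < n; i++) {
            b[i] = quads[4 * i];
            g[i] = quads[4 * i + 1];
            r[i] = quads[4 * i + 2];
            // quads[4 * i + 3] is the reserved byte and is ignored.
        }
        return new IndexColorModel(8, n, r, g, b);
    }
}
```

With such a model, the raw pixel indices can be handed to Java unchanged, and the color lookup happens when the image is drawn.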


Applications 


To illustrate Grabber, we built a simple Java application that displays video-camera input, allows users to take a snapshot, and saves snapshot images in a JPEG file.

The window layout contains two canvas panels, side-by-side. The left panel displays the continuously updating image from the camera; the right panel displays the latest snapshot. Users can connect and disconnect the camera, change the VideoGrabber’s settings, take a snapshot, and save the snapshot to a JPEG file; see Figure 3.

Figure 3: Image save application.

The left panel contains a VideoGrabberCanvas, which is initialized with an instance of VideoGrabber when the application starts up. The right panel, which displays the snapshot, is a simple subclass of the Swing class JComponent. It paints the snapshot image, drawing a white border around it to simulate a photograph.

The left-most round button toggles the camera on and off. Listing Three is the code for doing this. To turn the camera on, it calls vc.startup() (where vc references the VideoGrabber instance). This method spawns a thread that connects to the camera and repeatedly captures frames at the default frame rate. To turn the camera off, vc.shutdown() is called, which disconnects from the camera and terminates the capture thread.

The middle button spawns com.objectguild.camera.ControlPanelFrame, a dialog for changing the VideoGrabber’s dimensions, color depth, and frame rate; see Listing Four.

The rightmost button takes a picture by calling vc.snapshotImage() and causing the snapshot canvas to paint the image returned by that method; see Listing Five.

This application demonstrates the ease 
of incorporating video capture using 
Grabber for Java. Listings Three, Four, and 
Five contain virtually all of the camera- 
specific code; most of the development 
effort for this application went into lay- 
ing out the components, and adding 
image-saving capability (using a public- 
domain Java JPEG class). 

Grabber for Java is currently deployed 
at several beta sites. Current developments 
at the time of this writing include Linux 
support, and support for real-time video 
streaming and storage.

Listing One

public class VideoGrabberCanvas extends JComponent implements Observer {
    VideoGrabber vg;
    ...
    public VideoGrabberCanvas(VideoGrabber camera) {
        super();
        this.vg = camera;
        camera.addObserver(this);
        initializeImage();
    }
    ...
}

Listing Two

Image img;
...
// Create an action listener which spawns a modal dialog.
captureButton.addActionListener(new ActionListener() {
    public synchronized void actionPerformed(ActionEvent e) {
        try {
            // Modal dialog blocks this thread until "picture" is taken.
            img = SnapshotDialog.TakePicture(TestDialog.this, vc);
        } catch (ConnectFailedException ex) {
            NotifyDialog.showMessageDialog(TestDialog.this,
                "Unable to connect to camera");
            return;
        }
        imageCanvas.setImage(img);
        imageCanvas.repaint();
    }
});

Listing Three

/** If camera is connected, shuts it down; otherwise, connects to device. */
private void toggleConnect() {
    if (connected) {
        vc.shutdown();
        snapshotButton.setEnabled(false);
        connectButton.setIcon(ConnectIcon);
        connectButton.setToolTipText("Connect to camera");
        videoCanvas.repaint(); // clear the canvas
    } else {
        try {
            vc.startup();
        } catch (ConnectFailedException ex) {
            JOptionPane.showMessageDialog(this, "Unable to connect to camera",
                "Bummer", JOptionPane.ERROR_MESSAGE);
            return; // connection failed: leave the state unchanged
        }
        snapshotButton.setEnabled(true);
        connectButton.setIcon(DisconnectIcon);
        connectButton.setToolTipText("Disconnect from camera");
    }
    connected = !connected;
}
class ConnectItemListener implements ItemListener {
    public void itemStateChanged(ItemEvent e) {
        toggleConnect();
    }
}

Listing Four

...
settingsButton = createButton(SettingsText, SettingsUpIcon,
    SettingsDownIcon, "Change camera settings");
settingsButton.addActionListener(settingsListener);
...
class SettingsListener implements ActionListener {
    public void actionPerformed(ActionEvent e) {
        if (control == null) {
            control = new ControlPanelFrame(vc);
            control.pack();
        }
        control.setVisible(true);
    }
}

Listing Five

snapshotButton = createButton(CameraText, CameraUpIcon, CameraDownIcon,
    "Take picture");
snapshotButton.addActionListener(snapshotListener);

class SnapshotListener implements ActionListener {
    public void actionPerformed(ActionEvent e) {
        image = vc.snapshotImage();
        snapshotCanvas.setImage(image);
        snapshotCanvas.repaint();
    }
}

DDJ

Anatoly Kotlarsky


In many real-world scenarios, embedded-system developers must squeeze powerful functionality into limited memory spaces. To do this, they often have to turn away from true multitasking operating systems. This is the situation we faced at Auto Image ID (http://www.autoimageid.com/), the company I work for, when building a fixed-position video bar-code scanner. To address the problem, I ended up writing a real-time kernel called SPARK, short for “Small Portable Adjustable Real-time Kernel.” We subsequently spun off a startup company called Real-Time Microsystems (http://www.realtimemicrosystems.com/) to continue SPARK development. In this article, I'll present SPARK by describing how I used it in the implementation of the Auto Image ID ID4100 video bar-code scanner; see Figure 1.

SPARK is a royalty-free, fast, tiny, portable real-time kernel. Leaving interrupt handling to you, SPARK doesn’t include anything that real-time embedded systems developers would consider redundant or platform specific. On the other hand, SPARK’s event- and state-oriented nature and table-driven, modular architecture provide flexibility and reusability.





The SPARK kernel supports five main service functions:

• sparkExec() lets an application start up a new task. The task ID is passed to the function in the first input parameter. The third input parameter of the function is a pointer to the null-terminated string of characters, which SPARK passes on to the task as the task’s command line of arguments. (The second parameter of the sparkExec() function and the only parameter of the sparkKill() function specify an exit code for the completed task. SPARK does not do any processing of the task exit codes. However, the exit code can be checked by the sparkPostProcess() hook if any specific action on the particular exit code of the particular task must be done.) SPARK provides nonpreemptive multitasking. When a new task is initiated, the currently running task gets closed. When no task is running, the system goes to the SPARK idle loop in which it does nothing but wait for any event to occur.

• sparkKill() kills the currently running task and puts the system into the SPARK idle loop.

• sparkSetAppState() sets a new state of the application. The state ID is passed to this function as an input parameter.

• sparkGetAppState() returns the ID of the current state of the application.

• sparkPostEvent() lets an application trigger a reaction on the particular event. The event ID is passed to the function in the first input parameter. The second input parameter of this function is a pointer to the data that can be passed on to the event-processing function called an “event function.” (In the sparkPostEvent() function prototype, the pointer to the data is defined as a pointer to void. It’s up to you to define the actual structure of data being passed to the particular event function.)


The logic of the sparkPostEvent() function is simple. First, it tries to find and call the event function responsible for processing the event at the current state of the application. If the event function is not provided, then the check is made whether or not the event is system wide. If it is, the event function responsible for processing the event at any state of the application is called; otherwise, the event is ignored and control is passed back to the interrupted task. When found and called, the event function may decide either to start a new task by calling sparkExec(), kill the current task by calling sparkKill(), or do something (or nothing), and return to the current task.

The tasks, states, and events are referred to using their unique respective IDs. This adds flexibility to the system. All associations between the tasks, states, events, and called functions are made at the application-configuration level and can easily be reviewed or changed. Moreover, SPARK treats the task and state IDs as direct indices to the application-specific table of tasks and table of states, respectively, so that there is no time wasted in looking up the task function to start when sparkExec() is called or the list of events and their corresponding event functions in the table of states when sparkPostEvent() is called. This makes the system's reaction to events and its task switching extremely fast.

SPARK also provides a set of callback 
functions, called “SPARK hooks,” which 
are called by the kernel when the state 
of the system changes. By overriding the 
hooks, an application can change the de- 
fault behavior of the system and take full 
control over transitions of the system and 
application states. 

Some of the hooks are:

• sparkStartup(), called on system startup. You can override the hook to do any application-specific initializations here.

• sparkReadyToRun(), called to get the application confirmation before starting a new task. The function should return 1 if it’s okay to close the currently running task and start the new one; otherwise, the function returns 0. The default of this hook does nothing and returns 1. You can override the hook to perform application-specific checking prior to allowing SPARK to start a new task.

• sparkPreProcess(), called just before a new task is started. You can override the hook to do any application-specific initializations before the new task is started.

• sparkPostProcess(), called just after the new task is finished. You can override the hook to perform any application-specific actions after the task is finished.

• sparkIdle(), called between idle loops. You can override the hook to perform application-specific actions when the system is idle.

• sparkSetIntSystemStatus(), called whenever the interrupt system should be turned on or off. You should override the hook to provide proper manipulation of the platform-specific interrupt system.

• sparkGetIntSystemStatus(), called whenever the current status of the interrupt system is inquired. You should override the hook to provide proper manipulation of the platform-specific interrupt system.


Developing SPARK-based Applications

When developing SPARK-based applications, the most important thing you need to do is determine the tasks, states, and events of the embedded system and describe them in your application-configuration file using the SPARK configuration macrocommands. You then use SPARKCNF, the SPARK Configuration Tool, to generate the application configuration .c and .h files where the tables used by SPARK are automatically created. The command line for running SPARKCNF on UNIX or Windows host computers is:

sparkcnf.exe -i inp_file -e event_file -o outp_file

where inp_file is an application-configuration file written using the SPARK configuration macrocommands; event_file is the name of the file (normally, with the extension .h) that contains definitions of all event IDs specified in the inp_file, and outp_file is the root name of the generated .c and .h files.

Figure 1: The Auto Image ID ID4100 video bar-code scanner.

Listing One is excerpted from the 
ID4100 fixed-position video bar-code 
scanner’s application-configuration file. 
Although it’s only a fragment of the real 
configuration file, it demonstrates the most 
important elements of any SPARK-based 
application. 

In the TASKS section, each task is de- 
scribed by its ID followed by the task func- 
tion name. For example, the task that pro- 
vides the scanner main menu to users has 
the ID TASK_MAIN_MENU, and its func- 
tion name is taskMainMenu. 

In the STATES section, each application state is described by its ID and, optionally, by the immediately following EVENTS section. In the EVENTS section, every event that should be processed at the specified state of the application is described by the event ID followed by the event-function name. Similarly, in the SYSTEM_WIDE_EVENTS section, the events that may apply to any state of the application are described. For example, if the video input request event (SYSEVENT_VIDEOINP_REQUEST) is posted when the system is idle (application state STATE_IDLE), the function evRunScanner() gets called. If the same event is posted when the system is already in the process of getting video data (application state STATE_GET_VIDEO) or decoding data (application state STATE_DECODE), the function evRerunScanner() gets called, and so forth.

SPARKCNF automatically generates the actual values of the task and state IDs in the outp_file.h file. However, you need to provide the values of the event IDs in event_file. (The event IDs can also be generated automatically, and event_file is optional. I found it more helpful for complex embedded systems such as the video bar-code scanner to have a separate header file that properly enumerates the system events. This way, I can include this file in the source code of any reusable and separately compilable level of the application, BIOS for example, and post proper events from there when necessary without having to rely on generated event IDs.) Listing Two shows the definitions of event IDs used by the ID4100 video bar-code scanner. There are several event IDs that are defined internally by SPARK in spark.h (EVENT_POWER_UP is one of them). The application-specific event IDs begin from the value LAST_SPARK_SYSTEM_EVENT + 1.

For each task described in the configuration file, you should provide the task function, which must have the prototype:

typedef EXIT_CODE (*TASK)(char *cmd_line);

where cmd_line is a pointer to the null-terminated string of parameters passed to the task via the sparkExec() call.

For each event described in the configuration file, you should provide the event function, which must have the prototype:

typedef int (*EVENT_FUNC)(void * pPar);

where pPar is a pointer to the data that can be passed to the event function via the sparkPostEvent() call. The event function returns 1 to signal that the event has been processed successfully and the corresponding event function from the SYSTEM_WIDE_EVENTS section should not be called (if this event has an entry in the SYSTEM_WIDE_EVENTS section as well); otherwise, the event function should return 0. This behavior lets you provide the default action for certain system-wide events and override the default action or simply adjust it, depending on the state of the application.
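
The override rule just described can be sketched in C. This is only an illustration under stated assumptions, not SPARK's actual dispatcher: the EventEntry table layout and the names find_handler(), dispatch_event(), ev_override(), ev_adjust(), and ev_default() are hypothetical. Only the EVENT_FUNC prototype and the return convention (1 means "handled, skip the system-wide entry"; 0 means "also run the system-wide entry") come from the text.

```c
#include <stddef.h>

/* Prototype from the article. */
typedef int (*EVENT_FUNC)(void *pPar);

/* Hypothetical table entry: one event ID mapped to one event function. */
typedef struct {
    int event_id;
    EVENT_FUNC func;
} EventEntry;

/* Look up event_id in a table of n entries; returns NULL if absent. */
EVENT_FUNC find_handler(const EventEntry *table, size_t n, int event_id)
{
    for (size_t i = 0; i < n; i++) {
        if (table[i].event_id == event_id)
            return table[i].func;
    }
    return NULL;
}

/* Run the state-specific handler first. If it returns 1, the event is
   fully handled; otherwise fall through to the system-wide handler.
   Returns the number of handlers invoked (0, 1, or 2). */
int dispatch_event(const EventEntry *state_table, size_t state_n,
                   const EventEntry *system_table, size_t system_n,
                   int event_id, void *pPar)
{
    int calls = 0;
    EVENT_FUNC f = find_handler(state_table, state_n, event_id);
    if (f != NULL) {
        calls++;
        if (f(pPar) == 1)
            return calls;          /* override: skip the default action */
    }
    f = find_handler(system_table, system_n, event_id);
    if (f != NULL) {
        calls++;
        (void)f(pPar);             /* default (system-wide) action */
    }
    return calls;
}

/* Sample handlers used only to exercise the sketch. */
int default_calls = 0;
int ev_override(void *p) { (void)p; return 1; }   /* replaces the default */
int ev_adjust(void *p)   { (void)p; return 0; }   /* adjusts, then defers */
int ev_default(void *p)  { (void)p; default_calls++; return 1; }
```

With a state table that maps an event to ev_override(), the system-wide ev_default() is never reached; with ev_adjust(), both functions run in that order.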

To complete an application, you write 
all other application-specific functions, 
including interrupt-handling routines. 
Use the SPARK service functions to set 
application states, to post events, to start 
tasks, and so on. If necessary, you can 
override any SPARK hook to provide 
proper functionality of your embedded 
system. Compile the generated config- 
uration .c file along with all of your 
application-specific modules and over- 
ridden SPARK hooks and link them with 
the SPARK library (which should be built 
for the platform of your embedded sys- 
tem) and the C run-time library that sup- 
ports your CPU. This builds your appli- 
cation. Use the debugger and other tools 
available to debug your application and, 
perhaps, burn the ROM to make the ap- 
plication available for use in your em- 
bedded system. 


A Closer Look at the Bar-Code Scanner Application

To better illustrate how a real-time application can be built with SPARK, I'll take a closer look at the bar-code scanner application.

When the system starts up, SPARK fires the EVENT_POWER_UP event. This is described as a system-wide event in the configuration file (see Listing One) and is processed by evPowerUp() (see Listing Three). This function sets the application state to STATE_POWER_UP, does all necessary ID4100 scanner initializations by calling ID4100SystemInit(), and puts the system into an idle loop by calling sparkKill(0).

In between the idle loops, the sparkIdle() hook is called. The bar-code scanner application overrides this hook to set the application state to STATE_IDLE.

When an object containing a bar code is positioned in the field of view of the scanner, the scanner's hardware generates a trigger interrupt. The trigger interrupt handling routine then posts the SYSEVENT_VIDEOINP_REQUEST event by calling sparkPostEvent(SYSEVENT_VIDEOINP_REQUEST, (void *) 0). At the STATE_IDLE application state, this event is processed by evRunScanner(), which starts the video input and decoding process by calling sparkExec(TASK_RUN_SCANNER, 0, (char *) 0). While getting the video data, the application is in the STATE_GET_VIDEO state. When in the decoding process, the application state is set to STATE_DECODE. Upon return from the decoding task, the system comes back to the idle loop.

If the SYSEVENT_VIDEOINP_REQUEST event occurs while the application is in the STATE_GET_VIDEO or the STATE_DECODE state (meaning a real-time error situation), the event is processed by evRerunScanner(), which sends a real-time error message to the user prior to initiating the new video input and decoding process.

The decoding process may also be interrupted by the SYSEVENT_READ_TIMEOUT event generated by the system timer when the time allowed for decoding expires. The event function evScannerReadTimeOut(), called when the event occurs, sends a "No Read" message to the user and kills the decoding task, putting the system back into the idle loop.

Of course, this is only a fraction of the real bar-code scanner functionality. The SYSEVENT_NOWAIT_IN system-wide event is fired by the serial port interrupt-handling routine when it receives data not expected by the application. The SYSEVENT_BREAK event is triggered when a task should be stopped at the user's request. This allows the bar-code scanner to switch back and forth between the online running mode and the offline menu mode.
Listing One

TASKS
{
    TASK_MAIN_MENU:         taskMainMenu
    TASK_HELP:              taskHelp
    TASK_LIST_DECODERS:     taskListDecoders
    TASK_ENABLE_DECODER:    taskEnableDecoder
    TASK_DISABLE_DECODER:   taskDisableDecoder
    TASK_DECODE:            taskDecode
    TASK_RUN_SCANNER:       taskRunScanner
}
STATES
{
    STATE_POWER_UP
    STATE_IDLE
    EVENTS
    {
        SYSEVENT_VIDEOINP_REQUEST:  evRunScanner
    }
    STATE_GET_VIDEO
    EVENTS
    {
        SYSEVENT_VIDEOINP_REQUEST:  evRerunScanner
    }
    STATE_DECODE
    EVENTS
    {
        SYSEVENT_VIDEOINP_REQUEST:  evRerunScanner
        SYSEVENT_READ_TIMEOUT:      evScannerReadTimeOut
    }
    STATE_MMENU
}
SYSTEM_WIDE_EVENTS
{
    EVENT_POWER_UP:         evPowerUp
    SYSEVENT_NOWAIT_IN:     evNoWaitIn
    SYSEVENT_BREAK:         evBreak
}


Listing Two
#ifndef SYSEVENT_H
#define SYSEVENT_H

#include "spark.h"

/* Application-specific event IDs start right after SPARK's own. */
#define FIRST_SYSEVENT (LAST_SPARK_SYSTEM_EVENT + 1)
enum {
    SYSEVENT_NOWAIT_IN = FIRST_SYSEVENT,
    SYSEVENT_BREAK,
    SYSEVENT_VIDEOINP_REQUEST,
    SYSEVENT_READ_TIMEOUT,
    LAST_SYSEVENT = SYSEVENT_READ_TIMEOUT
};
#endif /* SYSEVENT_H */


Listing Three
#include "spark.h"
#include "sparkhks.h"
#include "id4100.h"

/***************************************************************
 EVENT_POWER_UP event-function.
***************************************************************/
int
evPowerUp(void * par)
{
    sparkSetAppState(STATE_POWER_UP);
    ID4100SystemInit();
    sparkKill(0);
    return 1;
}

/***************************************************************
 This is an override of the sparkIdle() SPARK hook.
***************************************************************/
void
sparkIdle(int first_call)
{
    if (first_call)
    {
        sparkSetAppState(STATE_IDLE);
    }
}

/***************************************************************
 This function handles the "Video Input Request" event when the
 system is in the idle loop.
***************************************************************/
int
evRunScanner(void * par)
{
    sparkExec(TASK_RUN_SCANNER, 0, (char *) 0);
    return 1;
}

EXIT_CODE
taskRunScanner(char * cmd_line)
{
    sparkSetAppState(STATE_GET_VIDEO);
    ID4100GetVideo();
    sparkExec(TASK_DECODE, 0, (char *) 0);
    return 0;
}

/***************************************************************
 This function handles the "Video Input Request" event when the
 system is busy decoding the previously entered data.
***************************************************************/
int
evRerunScanner(void * par)
{
    ID4100RTEMsg();
    sparkExec(TASK_RUN_SCANNER, 0, (char *) 0);
    return 1;
}

/***************************************************************
 This function handles the "Read Time Out" event when the
 system is decoding the previously entered data.
***************************************************************/
int
evScannerReadTimeOut(void * par)
{
    ID4100NoReadMsg();
    sparkKill(ERR_READ_TIMEOUT);
    return 1;
}
Automated Testing
for Web Applications

A cumbersome job made easy

M. Selvakumar

Web-user interfaces (WUIs) are now as common as the familiar GUI and command-line interfaces. WUIs can be constructed in many ways, including HTML, Java applets, and plug-ins. Of these, HTML is the most widely adopted approach. Furthermore, HTML can often be combined with Javascript to provide additional functionality. Web search engines such as HotBot, AltaVista, and Infoseek are examples of WUIs based on HTML.

As with GUIs, WUI testing is a critical part of the development process. Consequently, the need to formalize and automate WUI testing is paramount. In this article, I'll present an approach for automated WUI testing that I implemented and deployed for a WUI developed on a commercially available data-management application. This technique is based on HTML, Javascript, and CGI. The implementation environment for this technique was based on Netscape Communicator 4.04 (browser) and Apache 1.2 (server).

The author is an engineer for Texas Instruments India Ltd. He can be contacted at selvak@india.ti.com.


Existing Testing Techniques 
There are three widely used techniques 
for testing CGI-based applications: 


• Simulating a browser, to test the application logic performed by CGI programs. This is done by developing a tester program that generates an HTTP request (as if it came from a browser) and sends it to the web server. The server in turn invokes the respective CGI program and passes the result back to the tester program, which then checks and compares the output received. The disadvantages to this approach are: the browser's rendering of output HTML documents is not tested; client-side processing (done using Javascript) is not tested; and the test program and application have to communicate with each other using the HTTP protocol.

• Adding test structures to the application code, which involves two interfaces: one for interacting with the web server, and another for interacting with the tester. This lets the tester program interact with the application using a simpler, non-HTTP protocol. To test a login feature, for example, the tester program sends the values UserId and Password as command-line arguments to the program instead of generating an HTTP request. Hence, developing a tester program becomes easier. But the other disadvantages (no testing of browser rendering and client-side processing) remain. Also, this approach introduces the disadvantage of embedding structures in the application to handle test-mode interaction.

• Manual testing, which checks all components of the system, but the disadvantages are obvious: it is cumbersome, requires lots of time, and introduces the possibility of human errors.


Browser Representation of HTML Documents

When HTML documents are loaded into a browser, the browser creates a number of Javascript objects for the different components of the document; for example, one image object is created for each <img> tag in the HTML source. The created objects have attributes and methods. Attributes denote the properties of an HTML component, while methods, when invoked, perform relevant operations on the object. Consider a Text object, for example. One attribute of a Text object is value, denoting the current value entered in the form; focus is a method that, when invoked, sets focus to the Text field in the browser screen.

Browsers create these objects, then initialize and arrange them in a hierarchy that reflects the structure of the HTML page itself. Figure 1 shows how the browser represents a given HTML document. The sample HTML document contains one image, one form, two Text objects, and one submit component. All these components get mapped to Javascript objects inside the browser and are arranged in a hierarchy. All HTML documents loaded in the browser are internally represented in the same way.

[Figure 1: sample HTML document]

<html>
<head>
<title> A Sample HTML Doc. </title>
</head>
<body>
<img src=person.jpg name=photo>
<form name=person>
Name <input type=text name=persName>
Email Id <input type=text name=persid>
<input type=submit value="Enter">
</form>
</body>
</html>


Accessing Browser’s HTML Objects 
HTML documents that originate from the same server can access objects of other documents once they're loaded in the browser. This can be done using Javascript.

For example, in Figure 2, Pages A and B are HTML documents originating from the same server. Page A is a simple HTML document with a form that contains a component named “Industry,” which is a text object. Page B contains Javascript functions to access the Page A objects. It contains three buttons attached to three different Javascript functions. Clicking on the first button (Open Page A) loads Page A in a new browser window and receives a handle to the newly created window. Clicking on the second button (Set a Form Variable [Industry]) accesses the Page A object hierarchy and sets a value (Software) to its Text object. Finally, clicking on the third button (Close Page A) closes the browser window that held Page A. Thus, Page B simulates some user events on Page A by directly manipulating Page A's objects. Listings One and Two implement Pages A and B, respectively.

Figure 2: Accessing objects of another HTML document.


Automating WUI Testing 

A WUI to an application can sequentially generate HTML documents. In a typical database search scenario, for instance:

1. Users click on a Search button.
2. Users get back a keywords form.
3. Users fill in and submit the form.
4. Users get the results.

In this case, two HTML documents (keywords and output) are generated and two input events (click on Search and enter keywords) take place. Additional features might result in the generation of many more HTML documents.

Manually testing this feature might involve many iterations of the database search steps, each time with different test cases (keyword values). However, testing can be automated by writing a Javascript function that simulates user events. The steps might be:

1. Simulate pressing the Search button.
2. Wait until the HTML document with the keywords form is received.
3. Supply keyword values and submit the form.
4. Wait until the HTML document with the search results is received.
5. Check for the success of the search operation.

Once testing is automated, it can be deployed many times with different test cases. The search tester function takes keyword values as parameters and tests using those values. Figure 3 shows one way to automate WUI testing. The test engine is the core component that automates the testing. As Listing Four shows, it is simply an HTML page containing tester functions, a main function, and test cases.

Example 1: Format of test-cases files.


Tester functions are used for testing individual features of the application. Every feature that needs to be tested is associated with a specific tester function. All tester functions are written in Javascript. Each tester function takes all the input values required for a feature, applies them to the application screen by manipulating its objects, and submits the request. It then waits until the response for the given request is received and rendered by the browser. After output is received, it checks for the success of the operation by accessing the value of a fixed hidden variable in the application page. In case of unexpected results, it takes appropriate actions, such as suspending testing and informing users.

If the feature involves multiple screens, the tester function takes care of the synchronization by waiting until a response is received before proceeding with the next HTML document.

Example 2: Test case for a database search.

Figure 3: Automating UI testing of a web application. The steps shown are:
1. Generate the test engine with tester functions and embedded test cases.
2. Invoke the web application.
3. Invoke a feature and apply a test case.
4. Wait until the response is received for the invoked feature.
5. Check for the success of the operation.
(Steps 3-5 are repeated for different test cases.)

Figure 4: Automated testing of an online shop on the Web.

The main tester function manages the 
testing and determines the sequence in 
which features should be invoked and se- 
lects different test cases for each feature. 

The test engine maintains the test cases to be applied in Javascript data structures. The test engine is generated by test engine generators, which take the tester functions, read test cases from files, and embed them in Javascript data structures. Example 1 shows the format of test-case files. ID identifies the test case, FEATURE denotes which feature to invoke, and KEY and VALUE pairs denote the expected input variables and values for them. For a database search scenario, the test-case file might look like Example 2.

After updating test case files, the test 
engine generator needs to be run again 
so that the new test cases get loaded in 
the test engine. 


Testing a Sample Application 

For illustration, consider a hypothetical company called “ABC Audio” that sells audio products. ABC has enhanced its homepage to take online orders from customers. The WUI lists the available brands and price details. Customers select brands, specify the quantity and method of payment, and place the order. As Figure 4 illustrates, the WUI calculates total prices and informs the customer. This processing is straightforward, but real-life applications involve much more complex processing.

The WUI can be tested using a tester that implements the technique described here. The tester generates orders by getting realistic random numbers for each of the parameters (Quantity, Payment method, and so on), applies them to the online shop, and waits for the order to be processed. Once done, it goes back to the online shop page and proceeds with the next order. It repeats the testing for the specified number of times. Listings Three and Four present the HTML source for the online shop and online shop tester implementations, respectively.
Extensions

There are several ways in which you can enhance or otherwise extend the test engine I present here. These enhancements include:

• ...ing. It can access the output from the application for the given test case and send it to another CGI program, which will compare the new output with old output and store the results.

• Report generation. The test engine can be made to check for the success of the operation for each test case applied and generate a test report. The application can inform the test engine whether the operation was successful by setting a status variable in the output screen. The test engine can access the status variable to check the result.

• Record and play user events. This technique still requires writing tester functions for each of the features and generating test cases. This overhead can be eliminated if user events on an HTML page can be recorded to generate tester functions and test cases automatically.


Conclusion 

The automated testing technique presented here addresses most of the disadvantages of simulation and manual testing. Still, it does have an overhead cost of writing tester functions for each application feature. But it is a one-time effort, and the benefit achieved by this technique will be considerable for web-based applications that are released frequently.
Listing One

<html>
<!--
Subject: pageA.html
To     : Doctor Dobb's Journal
Section: Internet Programming
From   : M Selvakumar (selvak@india.ti.com)
Copyright, 1998, M Selvakumar, Texas Instruments Inc.
-->
<body>
<center>
<font size=5> Page A </font><br><hr>
</center>
<form method=post>
Industry : <input type=text name="industry">
</form>
<hr>
</body>
</html>
Listing Two
<html>
<!--
Subject: pageB.html
To     : Doctor Dobb's Journal
Section: Internet Programming
From   : M Selvakumar (selvak@india.ti.com)
Copyright, 1998, M Selvakumar, Texas Instruments Inc.
-->
<head>
<script language=Javascript>
var Win;
// ----------------------------------------------
// Open 'Page A'
function init () {
    Win = window.open("http://<your domain>/pageA.html");
}
// ----------------------------------------------
// Modify 'Page A'
function modifyFormA () {
    // Set a value to form variable - Industry
    Win.document.forms[0].industry.value = "Software";
}
// ----------------------------------------------
// Close 'Page A'
function term () { Win.close(); }
</script>
</head>
<body>
<center>
<font size=5> Page B </font><br><hr>
</center>
<form name=formB action="">
<input type=button value="Open PageA" onClick="init();" ><br>
<input type=button value="Set a Form Variable (Industry)"
       onClick="modifyFormA();" ><br>
<input type=button value="Close PageA" onClick="term();" ><br>
</form>
<hr>
</body>
</html>
Listing Three
<html>
<!--
Subject: Online Shop
To     : Doctor Dobb's Journal
Section: Internet Programming
From   : M Selvakumar (selvak@india.ti.com)
Copyright, 1998, M Selvakumar, Texas Instruments Inc.
-->
<head>
<title> ABC Audio</title>
</head>
<body bgcolor="#ffffff">
<center>
<font size=6>ABC Audio</font><br>
<font size=5>On-line Shopping! </font><br>
</center>
<br><hr>
<font size=5>Personal Audio</font>
<br><br>
<font size=4>Please select the items & quantities</font>
<form method=post action="http://<ABC Audio Domain Name>/cgi-bin/shop/onlineShop.pl">
<blockquote>
<table border=1>
<tr bgcolor=yellow><th><th>Brand<th>Model Name<th>Price<th>Quantity
<tr><td><input type=checkbox name=brands value="B1">
    <td> Sony
    <td> SA-1670
    <td> 20$
    <td><input type=text size=5 name="Q1">
<tr><td><input type=checkbox name=brands value="B2">
    <td> Aiwa
    <td> AI-W660
    <td> 22$
    <td><input type=text size=5 name="Q2">
<tr><td><input type=checkbox name=brands value="B3">
    <td> Panasonic
    <td> PA-X1256
    <td> 21$
    <td><input type=text size=5 name="Q3">
</table>
</blockquote>
<br>
<font size=5> Payment Method</font>
<blockquote>
<input type=radio name=payment value="P1">Visa
<input type=radio name=payment value="P2">Master Card
<input type=radio name=payment value="P3">American Express
</blockquote>
<hr>
<input type=submit value="Order Now!">
<input type=reset value="Clear">
</form>
</body>
</html>


Listing Four
<html>
<!--
Subject: Online Shop Tester Program
To     : Doctor Dobb's Journal
Section: Internet Programming
From   : M Selvakumar (selvak@india.ti.com)
Copyright, 1998, M Selvakumar, Texas Instruments Inc.
-->
<head>
<title> ABC Audio On-line Shop Tester </title>
<script>
var Win, RepWin;
var count=0, noOfTests;
// TEST ENGINE -- Contains Main Function, Tester Functions and
// randomly generated Testcases

// Main Function - Test Shop
//   1. Generate and apply a random order
//   2. Wait for the response
//   3. Get back the Shop for next test
function test( task ) {
    if ( task == 1 ) {
        submitAShoppingReq();
        task = 2;
    }
    if ( task == 2 ) {
        var ret = goBack();
        if ( ret == false ) {
            // Response not received yet; poll again later
            setTimeout("test(2)", 2000);
            return false;
        }
    }
    count++;
    // Move to next order
    if ( count <= document.forms[0].testCount.value ) {
        setTimeout("test(1)", 500);
    }
    return true;
}
function init() {
    Win = window.open("http://<ABC Audio Domain Name>/shop.html", '',
                      'width=500,height=600');
}
function submitAShoppingReq() {
    var tmp = Math.round(Math.random()*10);
    var payment = tmp%3;
    Win.document.forms[0].reset();
    Win.document.forms[0].brands[0].checked=true;
    Win.document.forms[0].brands[1].checked=true;
    Win.document.forms[0].brands[2].checked=true;
    Win.document.forms[0].Q1.value= Math.round(Math.random()*300);
    Win.document.forms[0].Q2.value= Math.round(Math.random()*200);
    Win.document.forms[0].Q3.value= Math.round(Math.random()*100);
    Win.document.forms[0].payment[payment].checked = true;
    Win.document.forms[0].submit();
}
function term() { Win.close(); }
function goBack() {
    // Check if the response is received
    if ( typeof(Win.opOver) == 'undefined' ) {
        return false;
    }
    // Check for success of the operation
    if ( Win.document.forms[0].result.value != "Success" ) {
        alert("Operation not Successful");
    }
    Win.history.go(-1);
    return true;
}
</script>
</head>
<body bgcolor="#ffffff">
<center>
<font size=5>ABC Audio On-line Shop </font><br>
<font size=4>Web Interface Tester</font><br>
</center>
<br><hr>
<form name=Tester onSubmit="parent.test()">
<input type=button value="Open the Shop!" onClick="init();"><br>
No. of Tests
<input type=text name=testCount size=5 value=1>
<input type=button value="Test!" onClick="count=1; test(1);"><br>
<input type=button value="Close the Shop!" onClick="term();"><br>
<br><hr>
</form>
</body>
</html>
Even though source-code version control is a critical part of project management, it remains one of the most neglected aspects of the development process. Poorly maintained version control can itself create bugs that jeopardize project schedules and software quality. The good news is that most of these problems can be avoided by properly using source-code version-control software. There are a number of commercial and free source-code version-control packages available, including:

• Intersolv PVCS Version Manager (http://www.microfocus.com/products/pvcs.htm).
• Microsoft Visual SourceSafe (http://msdn.microsoft.com/ssafe/).
• Burton Systems TLIB (http://www.burtonsys.com/).
• Source Code Control System (available on UNIX systems).
• Rational ClearCase (http://www.rational.com/products/ccmbu/).
• Concurrent Version System (CVS) (http://www.cyclic.com/).
• Project Revision Control System (PRCS) (http://xcf.berkeley.edu/~macd/prcs.html).

For a more complete list of commercially available packages, see http://www.loria.fr/~molli/cm/cm-FAQ/cm-tools-7.html. A list of freely available version-control alternatives (primarily for UNIX) can also be found at http://www.loria.fr/~molli/cm/cm-FAQ/cm-tools-6.html.

When it comes down to it, however, which version-control software package you use is secondary. What is really important is how you use it. Consequently, in this article, I'll describe potential problems and investigate practices that will help avoid problems, no matter which source-code version-control package you use.

Aspi manages projects that require extensive coordination in software, hardware, and mechanical development. His area of interest is in developing processes and methodologies for team-based project management. You can contact him at ahavewala@hotmail.com.


The Need for Version Control 

One of the central activities in any pro- 
ject is the constant addition, deletion, and 
modification of source code. In a sim- 
plistic scenario, individual developers 
work on a given set of files and never 
have a reason to modify files outside a giv- 
en set. In today’s world of substantial code 
reuse brought about by object-oriented 
programming, however, this simplistic 
scenario is unrealistic. A more common 
scenario involves several developers si- 
multaneously modifying shared source 
code. If a developer is making changes 
to a particular file, other developers 
shouldn’t be making changes to it. 
Source-code version control, therefore, 
is a set of working rules for code shar- 
ing that lets developers modify files in an 
exclusive way. 

In addition to coordinating access, 
whenever a developer makes modifica- 
tions to a file, version control maintains 
a separate version of each file in a 
database. Each version can then be ref- 
erenced individually if desired. Thus, it 
maintains a journal of changes made to 
each file in the source code. At any giv- 
en time, a set of files may have several 
different versions in the database (de- 
pending on how many times the files 
were modified). Version control lets you 
specify a label for a set of files, which 
marks the current version of each file at 
a given point in time. Such a label can 
then be used to retrieve the set of files 
that were current at the time the label 
was assigned. 
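
The idea of a label can be sketched as a snapshot of version numbers. This is a minimal illustration rather than how any particular package stores labels: the TrackedFile and Label structures, the MAX_FILES limit, and the functions label_assign() and label_version_of() are all hypothetical names chosen for the example.

```c
#include <stddef.h>

#define MAX_FILES 8   /* arbitrary limit for this sketch */

/* A file under version control; only its latest version number is modeled. */
typedef struct {
    const char *name;
    int current_version;
} TrackedFile;

/* A label records which version of each file was current when it was assigned. */
typedef struct {
    const char *text;
    int versions[MAX_FILES];
    size_t count;
} Label;

/* Snapshot the current version of every file under the given label text.
   Returns 0 on success, -1 if there are more files than the sketch supports. */
int label_assign(Label *label, const char *text,
                 const TrackedFile *files, size_t n)
{
    if (n > MAX_FILES)
        return -1;
    label->text = text;
    label->count = n;
    for (size_t i = 0; i < n; i++)
        label->versions[i] = files[i].current_version;
    return 0;
}

/* Version of file i that the label refers to, or -1 if i is out of range. */
int label_version_of(const Label *label, size_t i)
{
    return (i < label->count) ? label->versions[i] : -1;
}
```

Because the label stores its own copy of the version numbers, later modifications to a file do not change what the label retrieves.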




How Version Control is Used 

Version control can be realized via one of 
many commercially available software 
packages. Version control maintains copies 
of all the files comprising the source code 
of a given project in a version-control 
database. This database is usually read- 
able only by the version-control software 
front end. To extract a file from the database, you perform a Get or Refresh operation, which extracts the latest version of the file from the database and copies it onto your hard disk. Where the file is copied depends on the working directory set up by users. Working directories can be associated with individual files or an entire project.

When you are ready to modify a file, 
you lock it. This marks the file “in-use” 
in the version-control database and lets 
you modify it in an exclusive way. When 
you finalize changes, you “check-in” the 
file. This causes the file to be copied 
back into the version-control database. 
The lock on the file is removed, thus 
making it available to other developers 
who may want to modify it. Some version- 
control systems combine the Get/Refresh 
and Lock operations into one operation 
called “check-out.” Figure 1 illustrates 
the state transitions for a file under ver- 
sion control. 
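
The lock and check-in cycle described above amounts to a small state machine, sketched here in C. The sketch rests on assumptions: the VcFile structure, the two-state FileState enumeration, and the vc_* function names are hypothetical, and a real version-control package tracks far more than a lock holder and a version counter.

```c
#include <string.h>

typedef enum { FILE_AVAILABLE, FILE_LOCKED } FileState;

typedef struct {
    FileState state;
    char locked_by[32];   /* developer holding the lock, "" if none */
    int version;          /* number of the latest checked-in version */
} VcFile;

/* Get/Refresh: read-only; returns the latest version and never changes state. */
int vc_get(const VcFile *f)
{
    return f->version;
}

/* Lock: succeeds only if nobody holds the file. Returns 1 on success, 0 otherwise.
   Names longer than 31 characters are truncated in this sketch. */
int vc_lock(VcFile *f, const char *user)
{
    if (f->state == FILE_LOCKED)
        return 0;
    f->state = FILE_LOCKED;
    strncpy(f->locked_by, user, sizeof f->locked_by - 1);
    f->locked_by[sizeof f->locked_by - 1] = '\0';
    return 1;
}

/* Check-in: only the lock holder may check in. Stores a new version and
   releases the lock. Returns 1 on success, 0 otherwise. */
int vc_check_in(VcFile *f, const char *user)
{
    if (f->state != FILE_LOCKED || strcmp(f->locked_by, user) != 0)
        return 0;
    f->version++;
    f->state = FILE_AVAILABLE;
    f->locked_by[0] = '\0';
    return 1;
}

/* Check-out: Lock and Get/Refresh combined into one operation.
   Returns the version obtained, or -1 if the file is already locked. */
int vc_check_out(VcFile *f, const char *user)
{
    if (!vc_lock(f, user))
        return -1;
    return vc_get(f);
}
```

A second developer's vc_lock() fails while the first holds the file, which is the exclusive-modification rule the text describes; the lock is released only by the holder's check-in.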

In addition to developers, administra- 
tors of version-control systems perform 
special duties to ensure version control 
functions smoothly and the integrity of the 
database remains intact. Administrators have rights to perform maintenance operations that are off-limits to developers. Figure 2 is a use-case diagram that shows the functionality provided by version control and how it is used by different types of users. The diagram illustrates the partitioning of functionality I recommend here. Your version-control software may not necessarily enforce such a partitioning of use cases.
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Designing a Directory Layout 

One of the first tasks you should assign 
when starting a new version-control 
database is designing a directory layout. 
The question behind this is simple but im- 
portant: How will your modules be orga- 
nized in their source form? Designing a 
directory structure for your modules is important
for several reasons. It forces you
to think in terms of the implementation 
of your design. I refer to such tasks as 
“designing your implementation.” This 
sounds like a contradiction, but it makes 
sense. If you are writing a device driver, 
for example, you can factor this into sev- 
eral implementation issues. Will your driv- 
er be in assembly, C, or C++? If it needs 
to operate on more than one platform,
how will the platform-dependent versus
the platform-independent code be struc- 
tured in terms of files? How will different 
versions of your driver be built? Will they 
come from the same source code or will 
new directories be created for feature re- 
visions of your driver? What happens when 
the hardware itself is revised? Will you use 
conditional compiling to support differences 
in your hardware? Or will you create a new 
project in your version-control database? Is 
some source code shared with another driv- 
er or is an API shared with an application 
that talks to your driver? Where will this 
common source code be placed so that it 
is accessible to all the modules that use it? 

Such questions let you map out a strategy
for implementing your design. If these
questions are not addressed at the beginning
of the implementation cycle and you
leave each team member to figure things
out individually, it’s like starting a small
fire in a room full of explosives. Before
you know it, one explosion will lead to
another and the entire room will be such
a mess that you’ll have to start all over
again.

The answers to these questions are also 
critical in helping you design a directory 
layout for your source code. Once you 
have a directory layout designed and doc- 
umented, I suggest creating a project struc- 
ture in your version-control database that 
maps to your directory layout. In other 
words, each directory on your hard disk 
becomes a project (or subproject) in your 
version-control database. This simple one- 
to-one mapping between directories and 
version-control projects lets you structure 
source so that it is optimal for expanding, 
maintaining, and building. It also creates 
a paradigm for version control (based on 
directories) that your team will find easy 
to understand and work with. 


Finding a Home for Shared Code 
One of the key aspects of designing a di- 
rectory layout is visualizing all the modules 
in your project and understanding their po- 
sition in the software hierarchy. You should 
also spend time figuring out their relation- 
ship to other modules. How the source 
code changes in the future is something 
you can’t predict, but even a partial pre- 
diction will help you structure directories. 
Once you identify bare bones relationships 
between your modules, you need to des- 
ignate directories that hold source code 
shared between several modules. Header 
files are typical candidates for this type of 
code sharing, but there may be entire mod- 
ules whose source needs to be shared, such 
as a debug library or a utility class. 
Figure 3 is a hypothetical directory
structure starting from a directory called
“DevRoot,” which is assumed to be at
level 0. The level of each directory is
enclosed in parentheses next to its name.
The directory named “Com” refers to
“Common.” The root directory is always
designated to be at level 0. I assign the
level n to a directory that is n directories
below the root. A directory at level 2 is
said to be lower than a directory at
level 1. You always add shared source code
under a directory called Common. Next,
propose some rules regarding the
directory placement of modules that need
access to this shared code. Modules can
share code in a Common directory at a
level no lower than n, given that the
modules themselves are in a directory at
level n. That is, no module at level n can
include a header file or source file, nor
link to a library, that exists in a directory
at level n+1 or below. To enjoy the
benefits of this rule, you combine it with
another one. The Common directory that a
module may share code from must be a
sibling directory of a direct ancestor. In
other words, when looking for code shared
by a module, you can only traverse upwards
from that directory in your search
for directories called Common.
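
The twin rules can be expressed as a short path calculation. The sketch below is one possible reading of them, not code from any version-control package: the directory names under `DevRoot` are invented, and `allowed_common_dirs`, `may_share`, and `level` are hypothetical helpers. It assumes the shared directory is literally named `Common` and treats paths as plain strings, so nothing is read from disk.

```python
from pathlib import PurePosixPath

COMMON = "Common"  # directory name the rules look for


def level(path, root="DevRoot"):
    """Number of directories between path and the root (root is level 0)."""
    parts = PurePosixPath(path).parts
    return len(parts) - 1 - parts.index(root)


def allowed_common_dirs(module_dir):
    """Common directories found by walking upward from the module.

    At each ancestor directory, a child named Common is a candidate.
    The nearest candidate comes first.
    """
    module = PurePosixPath(module_dir)
    return [ancestor / COMMON for ancestor in module.parents if ancestor.name]


def may_share(module_dir, shared_dir):
    """True if shared_dir is, or lies inside, a Common directory the module may use."""
    shared = PurePosixPath(shared_dir)
    return any(
        common == shared or common in shared.parents
        for common in allowed_common_dirs(module_dir)
    )
```

A module deep in the tree sees every Common directory on its way up to the root, while a module near the root never sees a Common directory buried below it.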

The higher the level of a Common
directory, the greater its potential for
affecting other modules. In Figure 3, the
source in the directories under Com (3)
can be shared by the modules Cm1 (3),
Cm1 (4), and Cm2 (4). Modules
Cm1 (2), Cm2 (2), and Cm3 (2), however,
cannot access the code in this directory.
All modules in this directory hierarchy may
access code in Inc (2) and DebugLib (2).

The advantage of these rules is that 
relationships between modules that share 
code are easy to maintain. Why all this 
interest in shared code? By its nature, 
shared code is of special interest to ev- 
eryone in the project. It represents an 
area of productivity you can really lever- 
age, but also harbors a high amount of 
risk when modified. 

By simply browsing the directory struc- 
ture, you can determine whether code is 
shared (by looking for the directory Com- 
mon) and which modules are likely to 
share it (by looking for modules at the 
same level or below under sibling direc- 
tories of Common). One quick and tempt- 
ing way to follow this rule would be to 
always create a single, universal Common 
directory at the top-most level and place 
all shared code there. However, this ap- 
proach has several disadvantages that can 
create ongoing maintenance problems. 
For large projects in which several mod- 
ules at different levels may share code, 
this universal Common directory may be- 
come populated with a large number of 
files. It is impossible to track which mod- 
ules share what code in this universal 
Common. By providing such a global 
repository for shared code, you also of- 
fer developers a simplistic way to share 
code. But isn’t making code easy to share 
a good thing? Not necessarily. Sharing 
code between modules is a decision that 
needs to be made with careful thought. 

Say you are managing the source code 
for a project. You follow the rules and 
share some common code at directory lev- 
el 20. At this level, you have three mod- 
ules and so you know that at least these 
modules could include the shared code. 
Furthermore, say a module at level 15 
needs to use this shared code. This vio- 
lates the rule that states that the shared code 
must exist at a directory level equal to or 
higher than the one in which the module 
exists. The solution would be to promote 
the shared code to a sibling directory at
level 15. You would now have a Common
directory at level 15 for shared code. You 
also know that all modules at level 15 in 
sibling directories may share this code. 
Such a promotion should set some alarm 
bells off in your mind. When making such 
decisions during feature development, or 
even during bug fixing, you can ascertain 
the level of risk by looking at the level of 
the Common directory and counting the 
number of potential modules that could 
be affected. 
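
Counting the modules exposed to a change in a given Common directory is easy to automate. The following is a rough sketch under stated assumptions: module directories are supplied as a plain list of paths (the names here are invented), and `modules_at_risk` is a hypothetical helper that applies the sharing rules as described above, treating every module under the Common directory's parent as a potential user.

```python
from pathlib import PurePosixPath


def modules_at_risk(common_dir, module_dirs):
    """Return the modules that are allowed to share code from common_dir."""
    common = PurePosixPath(common_dir)
    scope = common.parent  # any module below this directory may share the code
    at_risk = []
    for module in map(PurePosixPath, module_dirs):
        inside_scope = scope in module.parents
        inside_common = module == common or common in module.parents
        if inside_scope and not inside_common:
            at_risk.append(str(module))
    return at_risk
```

Running this before and after a proposed promotion shows how far the blast radius grows when shared code moves toward the root.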

You should have a good idea of the 
benefits of the twin rules by now. Fol- 
lowing them lets you organize shared code 
in a tight, logical way. In contrast, con- 
sider shared code that is sprinkled across 
your directory hierarchy without any uni- 
formity of location. A module in a top- 
level directory could be sharing source 
code from a directory buried somewhere 
in your directory tree. Sounds like some- 
thing you don’t want to deal with, right? 


Adding Files to Version Control: The Great Debate

I’ve almost never worked with a team 
where the issue of what type of files to 
add to the version-control database hasn’t 
been debated. Obviously, adding a 
source file is never an issue. Things heat 
up when you discuss files like documents, 
schematics, object modules, and exe- 
cutable files. Documentation should po- 
tentially go into a separate Documents 
database. If you don’t have one, create one. 
There are plenty of document-management 
packages available. Most word proces- 
sors or document producing software also 
let you implement some form of ver- 
sioning. Schematics are like hardware 
source code, so an argument can be 
made that those need to go into version 
control. 

The temptation of adding objects,
executables, and temporary files comes
from the fact that you can take a full build
of your source code, add all the executables
and intermediate files to version
control, and have it serve as a checkpoint.
This is misuse of a version-control system.
First, there are too many intermediate
and output files produced by compilers
and other tools these days, and making
sure all of them have been added and
updated from one checkpoint build to
another is a logistical nightmare. Second,
it grows your version-control database
significantly. Remember that your database
must be able to stand up to the rigors
of constant check-ins and check-outs.
Don’t stress the version-control database
any more than you have to.

How do you maintain those build 
checkpoints? By setting up a reproducible 
automated build. The basic idea is that 
you should be able to check-out the en- 
tire source at a given checkpoint and pro- 
duce the same build made from the same 
source at an earlier date. 

The rule is this: If there are any files 
generated from the build process, don’t 
add them to version control. This includes 
translated files such as headers for differ- 
ent languages or platforms, which are gen- 
erated from a master header file. If you 
keep the version-control database lean 
and clean, maintenance requires signifi- 
cantly less effort, and developers will thank 
you when the refresh doesn’t force them 
to take a coffee break. 
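
A rule like this is easier to follow when a script applies it. Here is a minimal sketch, assuming that build outputs can be recognized by file extension and that headers generated from a master header are listed explicitly. The extension list and the `belongs_in_version_control` helper are illustrative guesses, not a definitive inventory for any particular compiler.

```python
from pathlib import PurePosixPath

# Assumed build-output extensions; adjust the set for your own toolchain.
GENERATED_SUFFIXES = {".obj", ".o", ".lib", ".exe", ".dll", ".pdb", ".tmp"}


def belongs_in_version_control(path, generated_from_master=()):
    """Return False for files the build produces, True for everything else.

    generated_from_master lists files (such as translated headers) that the
    build derives from a master file and that therefore should not be added.
    """
    p = PurePosixPath(path)
    if p.suffix.lower() in GENERATED_SUFFIXES:
        return False
    return str(p) not in set(generated_from_master)
```

An administrator might run such a check over the list of files a developer proposes to add and question anything it rejects.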


Where do Tools Go? Anywhere but Here!

One interesting debate I participated in 
came from a suggestion that we should add 
all our tools to version control. This has the 
benefit of not having to install all the tools 
from scratch when someone needs to start 
working on the code. You simply do a re- 
fresh and get everything— the source code 
and all the tools you need to edit and build 
code. Although interesting, this approach 
is flawed in several respects. 

For one, compilers and their IDEs (think
Visual C++ 6.0 here) are becoming
increasingly complex. Which of the several
hundred files will you add to version
control? How will you add the files needing
to go into a specific installation
directory and how will you add those needing
to go into your operating-system
directory? When you install a modern compiler
and its IDE, the installer may create
registry entries. Who will make these entries
when you refresh the compiler from
the version-control database?

Figure 1: State diagram for file under version control.





Say a new version of your compiler ar- 
rives at your doorstep and, after running 
some compatibility tests and agreeing that 
everyone on your team should switch to 
it, you face the task of adding this new 
version to the version-control database. 
Any volunteers for this task? 

Finally, depending on the tool vendor 
and your license agreement, adding the

for your project. The most common
error made by developers is forgetting to
refresh and lock a file before they start 
working on it. When it comes time to 
check-in the file, the developer (I'll call 
him X) realizes that the file is not locked 
in his name. Two things can now happen 
and they both depend on the features of 
your version-control software. 

In one case, Developer X immediately
locks the file in his name and checks
in his version. This violates the state
diagram in Figure 1, which requires the
lock to be obtained before the file is
modified. The ordering matters because,
while Developer X was modifying the
file, Developer Y could have made
changes to it. Developer Y checks in his
changes, but Developer X never sees
them because he had done a refresh
before Developer Y checked in.
Now when Developer X checks in his
version of the file, Developer Y’s changes
are lost. Needless to say, this causes
heartburn when the feature or bug fix
added by Developer Y stops working
and he has to spend a day or two finding
out that Developer X (and his sloppiness)
is responsible.
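
The sequence of events is easier to see when replayed step by step. The toy model below is a sketch only: `Repository` is a hypothetical single-file store with an exclusive lock, not the interface of any real version-control product, and `late_lock_scenario` simply walks through the story of Developers X and Y.

```python
class Repository:
    """Toy single-file repository with an exclusive lock (illustrative only)."""

    def __init__(self, lines):
        self.lines = list(lines)
        self.locked_by = None

    def refresh(self):
        return list(self.lines)  # a private copy of the latest version

    def lock(self, who):
        if self.locked_by not in (None, who):
            raise RuntimeError(f"file is locked by {self.locked_by}")
        self.locked_by = who

    def check_in(self, who, lines):
        if self.locked_by != who:
            raise RuntimeError(f"{who} does not hold the lock")
        self.lines = list(lines)  # the whole file is replaced
        self.locked_by = None


def late_lock_scenario():
    repo = Repository(["original"])
    x_copy = repo.refresh()            # X refreshes but forgets to lock
    repo.lock("Y")                     # Y follows the rules
    repo.check_in("Y", repo.refresh() + ["fix by Y"])
    x_copy.append("feature by X")      # X edits a stale copy
    repo.lock("X")                     # X locks only at check-in time
    repo.check_in("X", x_copy)
    return repo.lines
```

The final contents hold X's feature but not Y's fix, because X's check-in replaced the file with a copy taken before Y's change.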

In another case, your version-control
software may play “good Samaritan” and 
refresh the file automatically when De- 
veloper X attempts to check it out. Hope- 
fully it warns the developer that this is 
about to happen because if it doesn’t (or 
if Developer X chooses to ignore the warn- 
ing), he will lose all those changes he was 
working on. 

The administrator needs to establish a 
pattern of usage whenever such things 
start happening (hopefully he will do it 
before this happens). Make team mem- 
bers lock any file that they plan to make 
changes to. Make sure they unlock it or 
check back their changes as‘soon as they 
are done. 

Some developers like to run experiments
that require changes to files that they don’t
necessarily want to lock, a scenario that
can turn ugly. If a developer makes
changes to a series of files that work but
loses track of those changes, he will have
to check-in all those files without really
knowing exactly what it is he has added.
He may even forget to check-in a file that
had changes made to it (if he didn’t check
it out, there is nothing to remind him).
Typically, this will result in a bug fix or
feature addition that works fine on the
developer’s workstation but cannot be
reproduced in a build made on another
machine. On the other hand, the experiment
may fail and the developer may have locked
the files for naught. What’s a poor
developer to do?

Figure 3: Directory levels and sharing.

My recommendation is to lock all the 
files you need when you make your 
changes on an experimental basis. Mark 
all of your changes with a unique com- 
ment. When it comes time to check-in 
your changes, search through all your 
source files, identify the changes, and de- 
cide if they should be kept. Remove com- 
ments if you don’t want them in there. If 
you decide not to check-in changes, sim- 
ply revoke your lock on those files. Re- 
fresh when you do this so that you are 
not carrying around your experimental 
changes in your build. 
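
Searching for the marker comment is a job for a small script. The sketch below assumes that every experimental change carries the same tag on its line and that the source text is already in memory as a mapping from filename to contents. The tag and the `find_marked_changes` helper are made up for illustration.

```python
def find_marked_changes(sources, marker):
    """Return (filename, line number, text) for each line carrying marker.

    sources maps a filename to that file's full text.
    """
    hits = []
    for name in sorted(sources):
        for number, line in enumerate(sources[name].splitlines(), start=1):
            if marker in line:
                hits.append((name, number, line.strip()))
    return hits
```

With a tag such as `EXP-42`, the result is a checklist of every line to review before deciding whether the experiment is kept or discarded.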

If you don’t feel comfortable locking all 
the files (maybe someone else has them 
locked and your experiment will be a 
quick one), make detailed notes on where 
and how to add changes. When you are 
done and decide to check-in changes, start 
from scratch. Refresh your source in such 
a way that you lose all changes, lock the 
relevant files, then add in changes from 
your notes. 


Broken Builds 

Another common error to watch out for
involves checking in multiple files with
changes. Say a developer adds a new constant
in several of his source files. He adds
the definition for this constant in a header
file in a shared common directory. Because
the definition is added just once, it
is easy to forget about it. When it comes
time to check-in the files, the developer
might check-in all the source files but
neglect to check-in the header file. This will
result in a compile-time error and a broken
build. Fortunately, the generated error
is easy to track down. Find the
offending file and check to see who was
the last person to make changes to that
file. This person will more often than not
be the offending party.
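
A check run just before check-in can catch the forgotten header. This is a sketch under simplifying assumptions: headers are identified by the exact name used in a quoted `#include` line, the set of locally modified files is already known, and `forgotten_headers` is a hypothetical helper rather than a feature of any version-control tool.

```python
import re

# Matches lines such as:  #include "limits_local.h"
INCLUDE = re.compile(r'^\s*#\s*include\s*"([^"]+)"', re.MULTILINE)


def forgotten_headers(check_in, modified, sources):
    """Headers included by files being checked in that were modified
    locally but are missing from the check-in set.

    check_in and modified are sets of filenames; sources maps a filename
    to its text.
    """
    forgotten = set()
    for name in check_in:
        for header in INCLUDE.findall(sources.get(name, "")):
            if header in modified and header not in check_in:
                forgotten.add(header)
    return sorted(forgotten)
```

An empty result does not prove the build is safe, but a non-empty one is a strong hint that something was left behind.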

Broken builds should be taken very se- 
riously. More than bugs in the code, a bro- 
ken build brings development for the team 
to a halt— especially if developers are in 
the most excellent habit of refreshing their 
files frequently. Teams have different ways 
of punishing offenders when it comes to 
breaking the build. We once had a donut 
rule in effect. Whoever broke the build 
would have to bring in donuts the next 
day. To preserve the general health of our 
team, we relaxed the rule so that one 
could substitute bagels instead. The mes- 
sage was clear, “Donut break the build!” 


Contention for Files 

Here’s a common scenario: A large num- 
ber of developers are working on the 
source code and have locks on a lot of 
files. At some point, a developer needs a 
file checked out to someone else. Some 
developers avoid the confrontation by 
making local changes to a file and wait- 
ing for the original developer to unlock 
the file at a later date. This saves time only 
in the short run. The best thing to do is
encourage developers to let others know
when they want a file that is locked. Two
developers who contend for a file should
be able to work out a solution where both 
have access to the file when they need 
it. If the debate is heading nowhere, step 
in and resolve it as an administrator. This 
sort of communication within your team 
is helpful. It encourages developers to fos- 
ter relations with other team members 
and potentially lets developers discuss 
planned changes to code they are work- 
ing on. If you see a situation where de- 
velopers are constantly in contention for 
a particular file, you should take a clos- 
er look at that file. It is probably a good 
candidate for splitting it up into a num- 
ber of smaller files. Provided you divvy 
it up according to areas of functionality, 
you will significantly alleviate the con- 
tention problem. 
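
If your team keeps any record of blocked requests, spotting the files that deserve a closer look takes only a few lines. The sketch below assumes such a record exists as a list of (developer, filename) pairs, which is an assumption about your process rather than something a version-control package necessarily provides. `contention_hot_spots` and its default threshold are illustrative.

```python
from collections import Counter


def contention_hot_spots(blocked_requests, threshold=3):
    """Return files that were wanted while locked at least threshold times.

    blocked_requests is an iterable of (developer, filename) pairs, one per
    occasion on which a developer needed a file someone else had locked.
    The most contended file comes first.
    """
    counts = Counter(filename for _, filename in blocked_requests)
    return [name for name, n in counts.most_common() if n >= threshold]
```

Files that show up here repeatedly are the candidates for splitting along functional lines.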


Fumigating the Hidden BugLord 

There is a nasty “BugLord” lurking in 
your source code that periodically cre- 
ates bugs and makes life miserable. Say 
developers are working on some critical 
features. If your project is anywhere near 
normal, these features will interact with 
each other. Now imagine a developer 
furiously working on a feature that has 
not been checked in for about a month. 
In other words, he is working with an
older version of the source code and no
other developer on the team has worked 
with his changes. Do the words “inte- 
gration problems” come to mind? 

This is the hidden problem that most 
project managers fail to account for and 
some administrators never recognize. You 
can effectively use version control to ad- 
dress this problem. Developers need to 
check-in working changes as soon as 
they are done. Holding on to something 
that is in good working order is just not 
productive for the rest of the team. Of- 
ten developers and project managers shy 
away from checking-in changes for fear 
of “destabilizing” the build. This works 
in the short run, but the time you will 
spend integrating at a later stage will 
more than cancel out any short-term
savings. To alleviate destabilization, dis- 
cuss when a major check-in is expected 
of team members and insist they carry it 
out as soon as they are ready— even if 
there is an immediate penalty to pay in 
terms of integration time. Integration car- 
ried out in smaller steps is beneficial be- 
cause it helps you track the status of the 
project more accurately. With this ap- 
proach, surprises don’t really have a 
chance to become surprises if detected 
early on. Over a period of time, team 
members will become better (and con- 
sequently more aggressive) at integrat- 
ing smaller pieces. As a result, a project 
lead will become better at predicting the 
general health of the source code, re- 
sulting in more accurate schedule pre- 
dictions. 


The Integrate and Test Phase 

You may find it helpful to schedule at least 
two major “integrate and test” phases in 
the life of your project. Depending on how 
much code you plan to produce, and the 
overall length of the project, you may as- 
sign sufficient time to this task to account 
for the entire team working on nothing 
but integrating all changes and producing 
a working build. Once again, you'll be in 
a position to track a project much more 
accurately this way. 

Enforcing a rule at this point is helpful: 
No one should check-in changes unless 
they have tested the build to see that it 
compiles and works reasonably well. 
Granted “works reasonably well” is a high- 
ly subjective term, but you should be able 
to decide with your team what this term 
means. Maybe you have a regression test 
suite that you’d like developers to run be- 
fore checking-in their changes. Alternate- 
ly, you may want to create a manual test 
script to be executed by developers to ver- 
ify that the build is in working order. If 
team members have been refreshing 
source code often enough, they will have 
relatively fewer problems during this unit
test phase. In any case, they need to make
sure they refresh the entire source code 
(other than the files locked by them) at 
least once before creating the build for 
unit testing. 


When and Where to Branch Code 
Every project linked to an ongoing prod- 
uct release must square up to the prickly 
topic of branching code. Say you are 
working on Version 1.0 of a product that 
is now in beta. Bug fixes need to be made 
but you are winding down on the release. 
Meanwhile, a team assembled to work on 
Version 2.0 needs to forge ahead. Although 
they need to use your code base, you 
don’t want their features to be added to 
the version you are working on. Yet you 
want to make sure the 2.0 team gets all 
the bug fixes you plan to make for Ver- 
sion 1.0. 

A technique most teams employ in such 
situations is to create a branch in the 
source code. By branching the code, you 
are creating two version streams. One on- 
going stream, which I'll call the “current 
stream,” is used to make bug fixes for the 
current version. The other stream, the 
“new stream,” is handed over to the team 
working on the next version for feature 
additions. 

When bug fixes are completed and the 
current version is released, developers mi- 
grate to the new stream. An alternate ap- 
proach to this is to have developers work- 
ing on the current stream migrate their 
changes to the new stream simultaneous- 
ly. This might take away precious time 
from the current development effort. If you 
have to migrate the changes from the cur- 
rent stream to the new stream at a later 
date, make sure these changes are well 
documented. If possible, assign the same 
developer to make changes to both 
streams. This developer is more likely to 
be able to read the documentation and re- 
call the steps required to migrate the 
changes to the new stream. 

In any case, the decision to branch must 
be made after a careful evaluation of the 
status of your current project, the esti- 
mated schedule, and the resources and 
timeline for the next version of the prod- 
uct. Only administrators should have the 
authority to branch source code. When 
the merge to the new stream is complete, 
administrators can choose to delete the 
old stream. Alternately, you can keep it 
around for reference but should archive 
and delete it as soon as the next iteration 
of the product reaches stability. This will 
help you keep your database trim. 

As an offshoot of this rule, only ad- 
ministrators should have the rights to per- 
manently delete a file from version con- 
trol. Giving developers this kind of 
permission can result in disaster.


Labels: Adding Checkpoints to your Source Code

Most major version-control software packages
let you create labels that span
the entire database and record which
version of each file was current when the
label was assigned. You can later retrieve
exactly those versions by retrieving the
files associated with the label. Sounds
simple enough.


A good version-control administrator constantly monitors the usage patterns of developers





Assigning and cataloguing labels is im- 
portant. If used injudiciously, labels can 
be difficult to decipher and lose their ben- 
efit quickly. 

Devise a standard format for creating 
labels that will be easy to read and deci- 
pher. If a label affects a major part of the 
project, assign it across all files in your 
project. The administrator is the only per- 
son who should be assigning such global 
labels. If multiple projects are assigning a
label to the same files, you should attach a
prefix to each label that identifies the project
for which the checkpoint is being
created. An example format for such a
label would be
PROJECTNAME_VERSION_PURPOSE.
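
A standard format is easiest to enforce when labels are built and checked by code rather than typed by hand. The following sketch assumes the three-part PROJECTNAME_VERSION_PURPOSE layout with uppercase letters and digits, plus dots in the version. The exact character rules and the `make_label` and `parse_label` helpers are assumptions chosen for the example.

```python
import re

# Three parts separated by underscores; dots are allowed only in the version.
LABEL = re.compile(r"^[A-Z0-9]+_[A-Z0-9.]+_[A-Z0-9]+$")


def make_label(project, version, purpose):
    """Build a global label and reject parts that would make it ambiguous."""
    label = "_".join(part.upper() for part in (project, version, purpose))
    if not LABEL.match(label):
        raise ValueError(f"not a valid label: {label}")
    return label


def parse_label(label):
    """Split a valid label back into (project, version, purpose)."""
    if not LABEL.match(label):
        raise ValueError(f"not a valid label: {label}")
    return tuple(label.split("_"))
```

Because underscores are reserved as separators, a project name that contains one is rejected instead of producing a label that cannot be split back into its parts.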

What about private labels— those that 
developers want to assign privately to
their portion of the code as a personal
checkpoint? This can be a valuable tool 
when it comes to marking critical stages 
in the development of a module. Devel- 
opers can usually assign a label provided 
it is meaningfully worded. This, unfortu- 
nately, is at the discretion of developers. 
However, you can insist on a couple of 
rules. A private label cannot be assigned 
to all files in the project. Only adminis- 
trators can do that. Second, a private la- 
bel must be prefixed with the developer’s 
username or initials. The first rule will prevent
global labels from growing to an
unmanageable volume. The second rule gives
you a rudimentary journal capability. If
you ever have a question about a partic- 
ular label, or if it sounds cryptic to you 
and you want developers to change it, you 
have an easy way to figure out whom to 
talk to. 
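
Both rules for private labels are mechanical enough to check automatically. This is a minimal sketch, assuming the prefix is separated from the rest of the label by an underscore and that the sets of labeled and project files are available. `check_private_label` is a hypothetical helper, and the separator is a guess, not a convention prescribed above.

```python
def check_private_label(label, username, labeled_files, project_files):
    """Return a list of rule violations for a private label (empty if none)."""
    problems = []
    if not label.startswith(username + "_"):
        problems.append("label must start with the developer's username or initials")
    if set(project_files) <= set(labeled_files):
        problems.append("a private label may not cover every file in the project")
    return problems
```

An empty list means the label passes. Whether the wording is meaningful still has to be judged by a person.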


Version-Control Inspections 

The best way to ensure that everyone is 
following essential rules and guidelines is 
to make frequent version-control inspec- 
tions. These would be similar to code in- 
spections, but you will spend most of your 
time looking at how the code is organized 
rather than how it is written. More specif- 
ically, where are new directories being 
added in the hierarchy, how is code be- 
ing shared, and are files being named 
meaningfully? Administrators have the best 
bird’s-eye view of the project and its 
source code. Watch out for generic file 
names. What, for example, does a file 
called “status.c” contain? Make sure de- 
velopers use filenames that are specific 
enough so that you don’t need to open 
up the file and look at its contents to figure
out what is in it. Rename files like
“status.c” to “FileStatus.c” if it clarifies the
contents of the file.

Although code inspections are separate 
exercises with different goals, they can be 
helpful in picking out potential code or- 
ganization problems. You'll almost always 
come across some constants that should 
have gone in a header file but were added 
to a source file instead. Convince the 
code-inspection team that such problems 
should draw their attention and be rec- 
ommended for correction. 


Conclusion 

Version control is an often-neglected ac- 
tivity in team-based software development. 
Its correct and smooth functioning ensures 
that projects won't have major hiccups. It 
is difficult to document the impact of prob- 
lems that come from incorrect or sloppy 
use of the version-control system. But, by 
allocating special attention to this aspect 
of development, you can avoid potentially
hazardous problems that affect the final
outcome of your project. A good way
to start is by designing the implementation
of your source code at a coarse level,
then laying down rules and guidelines
regarding how to use version control.
The best rules and guidelines are those 
that evolve from within a team, so re- 
member to bring everyone together and 
come up with a set of rules by consen- 
sus. If you get buy-in from the team, 
you'll have enthusiastic developers work- 
ing with version control and leveraging 
it to its fullest extent.


Doorknob Arguments


Al Stevens 


In January of this year Judy and I visited
our daughter and her family in Virginia.
Wendy lives there with her husband,
Lester, and two sons, Landon and
Woody. The occasion was Landon’s sev- 
enth birthday. To celebrate the milestone, 
Landon was permitted to host a Friday 
night slumber party with several of his 
friends, mostly small boys his own age, 
and, of course, Woody, who is four. The 
boys romped in the downstairs recreation 
room while we adults visited upstairs. 
Wendy had allowed that just this once the 
boys could carry on all night until they 
dropped one by one from sheer exhaus- 
tion. A wise decision since no other out- 
come would have been remotely feasible. 
After Wendy and Lester and Judy went to 
bed, I stayed up for a while in the living 
room, reading a bit, but mostly listening 
to the boisterous sounds coming from 
downstairs. Every now and then, amidst 
the endless chatter and giggles, came a 
chorus of shouts in unison followed by 
gales of laughter. The shouts were always 
the same word, “Doorknob!” I wondered 
what it meant. 

After a while, the celebration died down. 
I slunk down the stairs, peeked in on them, 
and found the boys all asleep on the floor 
scattered around the room among piles of 
pillows and blankets and toys. Still not 
knowing the significance of “doorknob,” I 
called it a night and turned in. 

The next morning found parents drop- 
ping in for coffee and to pick up their 
young charges. I quietly asked each of the 
parents about “doorknob,” but nobody 
had a clue. When all the small guests had 
departed, I waited for Landon to come up 
for breakfast. He was late, sleeping in af- 
ter his night of celebration and merriment. 
At last we were alone at the kitchen’s 
breakfast bar eating our cereal, and I asked 
him what it means when someone says, 
“doorknob.” 


Al is a DDJ contributing editor. He can be 
contacted at astevens@ddj.com.

“It means you didn’t do something bad,”
he answered cautiously. 

“Something bad like what?” I asked. Lan- 
don and I have no secrets from one an- 
other and can speak openly at all times. 

“It means that you didn’t...” Landon 
leaned closer, looked around to make sure 
that no one else was listening, and in a 
conspiratorial whisper said the word, the 
noun and verb, that describes a particu- 
lar human action that is generally not ac- 
ceptable behavior in polite society but that 
small boys, and big ones too, find to be 
hilarious for some reason. Having said the 
forbidden word, he stifled a giggle and 
continued, “Whenever a guy leaves one, 
everybody else says ‘doorknob’ so that 
people will know who didn’t leave it.” 

I struggled to maintain a modicum of 
composure. It wasn’t easy. “How did you 
learn that?” I managed to ask. 

“Cody told us,” he said, referring to his 
older cousin. “He’s 10 and knows about 
stuff like that. When somebody leaves one, 
the one who doesn’t say ‘doorknob’ is the 
one who left it, and that’s how you know 
who to blame.” 

I excused myself and hastily left the room. 

A couple of weeks have passed, and I 
have had time to reflect on the significance 
of “doorknob.” How simple and intuitive. 
Leave it to a bunch of little boys to coin 
a language element that solves a problem 
as old as mankind itself. When something 
bad is done, whoever did not do that
something bad says “doorknob,” and ev- 
eryone knows. Now it’s up to us adults to 
extend the solution to other appropriate 
idioms. I'll have to think about this for a 
while. 


What's in an argv? 

The April 1999 issue of DDJ included an 
article by Brian Kernighan and Rob Pike 
about parsing regular expressions. To 
demonstrate the technique, the authors 
included an example grep program. While 
editing the article and its code for techni- 
cal content, I found code similar to that 





shown in Example 1. Accompanying the 
text were examples of command lines that 
invoke the program like this: 


grep sometext *.txt 


My initial reaction was that the code 
does not work. I was only partially right. 
How right I was depends on how many 
readers develop under MS-DOS and how 
many develop under UNIX. That ratio di- 
rectly correlates to the degree to which I 
was right. Okay, so maybe I was mostly 
wrong, but you don’t expect me to admit 
it without a struggle, do you? 

The problem, as I saw it, was that the 
line of code that calls the Standard C fopen 
function passes a pointer to a string with 
the value “*.txt,” which is not a file spec- 
ification that fopen recognizes. MS-DOS 
programmers will immediately understand 
this problem. UNIX programmers will not 
see any problem at all. 

I called Brian Kernighan and asked him 
about it. After all, the “K” in K&R ought 
to know how to write C code. I was sure 
I'd found an oversight and that he would 
resoundingly thank me. Brian did recog- 
nize the problem right off, which was not 
that the code was wrong but that he was 
talking to an MS-DOS programmer. He pa- 
tiently explained that the UNIX command 
processor shell expands ambiguous file 
specifications into a list of filenames that 
the shell passes to the program in the argv 
array. The MS-DOS COMMAND.COM 
command processor makes no such ex- 
pansion and passes to the program what- 
ever the user enters on the command line. 

Just to be sure, I compiled a program 
like the one in Example 1 at the UNIX site 
where I develop CGI programs. I put a printf
into the program to display each of the argv
arguments on the console. Sure enough,
when I ran the program with “*.c” as the 
command-line argument, the program dis- 
played all the C source-code filenames in 
the current directory. Not that I didn’t be- 
lieve what Brian Kernighan said, you un- 
derstand. Just had to see it for myself. 
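The experiment described above can be sketched roughly as follows. This is only an illustration: the function name show_args is hypothetical and is not taken from the grep program in the April article.

```c
#include <stdio.h>

/* Print every command-line argument after the program name.
   Returns the number of arguments printed.
   show_args is a hypothetical name used only for this sketch. */
int show_args(int argc, char *argv[])
{
    int i, shown = 0;

    for (i = 1; i < argc; i++) {
        printf("argv[%d] = %s\n", i, argv[i]);
        shown++;
    }
    return shown;
}
```

Called from main as show_args(argc, argv), it prints whatever the command processor chose to pass: a list of matching filenames if a UNIX shell expanded the specification first, or the specification itself if nothing expanded it.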
This dialog between two internationally
famous C gurus (me, your humble yet
revered “C Programming” columnist, and 
Brian, a genuine C authority who actual- 
ly deserves recognition) raised two ques- 
tions. First, if I'm such a hot shot guru, 
why didn’t I know what UNIX shells do 
with ambiguous file specifications? Sec- 
ond, why doesn’t COMMAND.COM, now 
the world’s most widely used command 
processor, expand them like the Bourne 
shell and others do? 

Let’s start with my excuses. My C pro- 
gramming began years ago with Leor Zol- 
man’s BDS C compiler for CP/M and con- 
tinued with the Aztec C compiler on that 
platform, which implemented classic K&R 
C. Later I used most of the C compilers, 
K&R and Standard C, that were imple- 
mented for the PC MS-DOS platform, and, 
until GUI processing became the preferred 
way to write software for the PC, command-line
processing was a major part of that experience.
If the user was to enter ambiguous
file specifications with wild cards
on the command line, you had to include 
a function to parse them into lists of un- 
ambiguous file specifications. That re- 
quirement was a given, and programmers 
wrote and published general-purpose 
command-line option parsers and ex- 
panders. I wrote one, too, and reused it 
many times. 
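For readers who have never written one, here is a minimal sketch of what such an expander does. It is not the expander mentioned above, which the column does not show. As an assumption for illustration, it uses the POSIX glob() function in place of the directory-search calls an MS-DOS version would need, and the name expand_spec is hypothetical.

```c
#include <glob.h>
#include <stdio.h>

/* Expand one wildcard file specification and print each match.
   Returns the number of matching names, 0 if nothing matches,
   or -1 on any other glob() failure.
   expand_spec is a hypothetical name; glob() is the POSIX routine. */
int expand_spec(const char *spec)
{
    glob_t g;
    size_t i;
    int count;
    int rc = glob(spec, 0, NULL, &g);

    if (rc == GLOB_NOMATCH)
        return 0;
    if (rc != 0) {
        globfree(&g);
        return -1;
    }
    for (i = 0; i < g.gl_pathc; i++)
        printf("%s\n", g.gl_pathv[i]);
    count = (int) g.gl_pathc;
    globfree(&g);
    return count;
}
```

A program that wanted shell-style behavior on a platform without it would call something like this for each command-line argument and then process the resulting names instead of the raw argument.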

I thought I understood the command 
line inside and out. But during all this 
time, I wrote an occasional UNIX pro- 
gram, too. How come I never knew about 
file-specification expansion by the shell? 
The answer is I don’t really know why ex- 
cept that I taught myself everything I know 
about programming on both platforms, 
and somewhere along the way the teach- 
er let the student down. To offer a lame 
excuse, I will explain that none of my 
UNIX programs (that I remember) used 
the command line for file specifications. 
Those programs were mostly to support 
database engines and, more recently, CGI 
applications. If I had needed filename
expansion in one of those programs,
I probably would have included the expander
function I mentioned earlier. It
wouldn’t have harmed the program, but
it wouldn’t have done any good either.
The expander function would simply never
have seen a command-line option with
a wild card character to expand.

#include <stdio.h>

int main(int argc, char *argv[])
{
    FILE *f;
    int i;

    /* open and process each file named on the command line */
    for (i = 1; i < argc; i++) {
        f = fopen(argv[i], "r");
        if (f != NULL) {
            /* process f */
            fclose(f);
        }
    }
    return 0;
}

Example 1: A program with *argv[].

This new, yet old, piece of knowledge 
answers one of my questions about the 
<stdio.h> part of the C Standard. Why 
aren't there standard functions such as the 
typical findfirst and findnext that most PC 
compilers define in a <dir.h> header file? 
Now I know. 


Now, let’s try to answer why COM- 
MAND.COM does not similarly disam- 
biguate file specifications on the command 
line. Brian observed that this problem was 
solved and the solution defined by the 
UNIX developers about 30 years ago. It 
does not seem reasonable that the framers 
of MS-DOS would not have taken their 
example from those who pioneered the 
technology and paved the way for the op- 
erating systems to come. 

First some background. DOS, a 16-bit 
operating system designed to run on 8086/ 
8088 microcomputers, was originally a close 
clone of the CP/M operating system that 
ran on 8080 and Z80 microcomputers. Its 
development was made necessary because 
CP/M-86 was late being released and its au- 
thor needed something right away for an 
8086 platform his company was building. 
Later, Microsoft acquired DOS, renamed it 
MS-DOS, and persuaded IBM to use it on 
its newly introduced PC in 1981. After a few 
upgrades, MS-DOS looked much like it does 
today. All of which is ancient folklore for 
the historians to muddle over. 

The MS-DOS operating environment for 
running programs was constrained by the 
memory limitations of the PC platform and 
the fact that MS-DOS is a single-tasking 
operating system with no task swapping. 
(The inherent paucity of the operating 
system when paired with the requirements 
of contemporary applications later gave 


rise so such kludges as DOS extenders, 
terminate and stay resident programs, ex- 
tended memory, expanded memory, high 
memory, and so on.) As a result of such 
operating system limitations, the COM- 
MAND.COM command processor is di- 
vided into resident and transient parts. 
The resident part contains only the code 
that the operating system needs to break 
into the running program, to enable the 
running program to terminate, and to 
reload the transient part when the run- 
ning program terminates. The text that 
the user types on the command line is 
initially stored in the transient memory 
and then copied to the running program’s 
Program Segment Prefix (PSP), which, 
among other things, contains a 128-byte 
so-called “command tail” data space to 
contain the command-line data. (This ap- 
proach allows a running program to use 
one of the nonstandard exec— or 
spawn— functions or the standard sys- 
tem function to launch subprocesses with 
different command lines. But that’s hind- 
sight; it probably wasn’t a design objec- 
tive.) 

Obviously the command tail’s 128-byte 
limit is not big enough to support unlim- 
ited filename expansion. If the develop- 
ers of MS-DOS were trying to emulate the 
UNIX operating environment, they might 
have chosen a different way to implement 
the command-line expansion to enable 
variable length filename lists. They might 
have allocated memory for the arguments 
from the system heap, for example. But 
they did not. Why not? 

We sometimes forget that MS-DOS was 
not developed to be an operating system 
to support a UNIX style of programming 
in the C language. When I saw my very 
first IBM PC in a computer store in 1981, 
there was no compiled language avail- 
able —C or otherwise. You could code in 
interpreted Basic or in ASM. That was it. 
(There was talk of an alternative Pascal 
operating environment, but that concept 
never saw significant light of day.) The 
developers of MS-DOS were trying to get 
something working in the limited space 
provided (the first PCs had 64 K of RAM) 
with the odd memory architecture that 
IBM had contrived for their PC. It would 
be up to the C compiler builders to take 
the platform and make something mean- 
ingful out of it. 

This is where, I think, we had a divi- 
sion of responsibility with a big gap at the 
division. The first C compilers on the PC 
were ports of UNIX compilers. Those folks 
were accustomed to having the shell take 
care of command lines. COMMAND.COM 
and the 128-byte command tail in the PSP 
did not— and could not— support what 
the UNIX shells could do. The C compil- 
er builders had to take over responsibility 
for the command line, which was hand- 
ed to them in a single-byte string just as 
the user types it in. A C program expects 
argv to point to an array of pointers to 
NULL-terminated strings, with each string 
being one entry on the command line. 
COMMAND.COM doesn’t do that, so the 
compiler builders had to implement it in 
the run-time startup code. And they chose 
not to expand file specifications. And what
we do not have now is what they chose
not to provide. 
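To make the startup code's job concrete, here is a rough sketch of splitting a command tail into argv-style strings. It is an illustration only, not code from any particular compiler's run-time library: the name split_tail and its parameters are hypothetical, and real startup code also has to deal with quoting and with the length byte that precedes the tail in the PSP.

```c
#include <ctype.h>
#include <stddef.h>

/* Split a writable command-tail string into argv-style pointers, in place.
   Whitespace separates entries; no wildcard expansion is attempted.
   Stores at most max_args - 1 pointers followed by a NULL terminator
   (max_args must be at least 1) and returns the number of entries.
   split_tail is a hypothetical name used only for this sketch. */
int split_tail(char *tail, char *argv[], int max_args)
{
    int argc = 0;
    char *p = tail;

    while (*p != '\0' && argc < max_args - 1) {
        while (isspace((unsigned char) *p))
            p++;                      /* skip leading blanks */
        if (*p == '\0')
            break;
        argv[argc++] = p;             /* start of one entry */
        while (*p != '\0' && !isspace((unsigned char) *p))
            p++;
        if (*p != '\0')
            *p++ = '\0';              /* terminate the entry */
    }
    argv[argc] = NULL;
    return argc;
}
```

Given the tail " sometext *.txt", this yields two entries, and the second is still the unexpanded string "*.txt", which is exactly what fopen receives in Example 1 under MS-DOS.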

But, you ask, what about now? Aren’t 
the MS-DOS component of Windows 98 
and its NT and OS/2 counterparts all big 
time, grownup 32-bit operating systems? 
Don’t they have the resources available to 
support real shells a la UNIX? Sure they are 
and sure they do, and such shells are in- 
deed available; they’re just not a part of 
the distributed operating systems. And why 
not? I can only guess that the purveyors of 
those operating systems judge that today’s 
developers target (or should target) GUI 
applications exclusively, that command-line 
options are old hat, that everything today 
is done with dialog boxes anyway, so why 
bother? They have decided that the 30 year 
old legacy of UNIX is not contemporary 
enough for the developers of the New Mil- 
lennium and need not be paid the respect 
we Oldtimers think it deserves.

And so we unearth all the culprits. The 
MS-DOS developers are not completely at 
fault, although they share some of the 
blame. Sure, they chose not to write a shell 
that handled filename expansion, but the 
compiler builders made a similar decision 
when they were implementing the argc, 
argv logic in their startup code. And all 
those developers and pathfinders agreed 
that those decisions were adequate for the 
rest of us. What can I say? Oh, yeah... 

“Doorknob!” 


DDJ 
JAVA Q&A





How Do You Run 
Untrusted Classes? 


Lou Grinzo 


I recently worked on a Java project that
raised a complex question: How do you
run classes written by unknown people
as part of your Java 1.1 application, without
putting your system (or your sanity)
at risk? Since this project involved a pro- 
gramming contest open to the public, 
there was every likelihood I’d encounter 
code written by people with dishonorable 
intentions. Furthermore, my code would 
have to provide a set of methods that the 
untrusted classes can call for services 
unique to the game. This meant that dur- 
ing the execution of my program, control 
would bounce back and forth between 
trusted and untrusted classes. Clearly, I 
needed a way to put suspect code in the 
tightest possible security box. Two things 
quickly became apparent: First, that I 
needed to use a customized Security- 
Manager, and second, that even using that 
mechanism wouldn’t meet all my needs. 

Java’s SecurityManager architecture is 
like many other parts of the language — 
it’s streamlined yet surprisingly power- 
ful. The basic idea is that whenever any 
code tries to do certain interesting things, 
like read from or write to disk files, Java 
calls a method in the currently installed 
SecurityManager particular to that action 
to see if it should be allowed. If the 
method determines that the action is safe, 
it simply does nothing and returns; otherwise,
it throws a SecurityException. SecurityManager
has a few dozen methods,
tailored to different tasks that can be security
checked. (This is the basic mechanism
used to create the “sandbox” that
Java uses to control applets.) When you’re
running an application, however, by default
there is no SecurityManager installed,
so all activities are allowed and the entire
system is at the mercy of the program’s
whims, subject to whatever restrictions
the operating system places on
any normal program.

Lou is a freelance programmer and writer.
He can be contacted at 71055.1240@compuserve.com.

The currently installed security manag- 
er is a JVM-wide entity, and it isn’t applied 
based on which thread or class is initiat- 
ing an operation. If you create and install 
a custom SecurityManager that simply disallows
everything, then you'll likely trigger
unwanted exceptions, so a more sophisticated
approach is needed. Another
relevant detail is that you can install only 
one SecurityManager. Any attempt to re- 
place or remove a security manager throws 
an exception. While this guarantees your 
custom security manager won’t be over- 
ridden, it also means you’re stuck with it 
for the life of your program. 

Since my code had to be fully trusted 
and have free rein of the system, this
architecture created a problem. I need- 
ed a way to provide fine-grained secu- 
rity control, based on which class was 
responsible for the operation being se- 
curity checked. My first solution was sim- 
ple, obvious, efficient — and ugly. I cre- 
ated a mutableBoolean class, which 
wrapped a single, publicly accessible 
Boolean variable named enabled, and 
passed this via reference to my security 
manager’s constructor, which stored the 
reference (not the Boolean itself). In each 
of the security manager’s methods, I sim- 
ply check the value of enabled: True 
means strict security is in place, so throw
an exception, while False means all bets
are off, let the code do what it wants. 
This worked fine, but it created a code 
maintenance burden, because I had to 
set the Boolean just before passing con- 
trol to the untrusted code, and reset it 
whenever I got control back. This also 
required me to flip the Boolean at the 
entry and exit points of each of those 
callback methods, and make sure that the 
Boolean was in the proper state at all 
times when other, unrelated parts of my 
program got control. As if all that weren't 
bad enough, it also created a slight se- 
curity risk. Listings One through Five pre- 
sent the classes needed to run this switch- 
able security manager, including callback 
support for the untrusted classes. The 
comment in SMdemo (Listing Two) pro- 
vides instructions on running both ver- 
sions of the programs. 


Security, Take 2 

My second attempt at improved security 
proved the old programmer axiom “write 
it twice and throw the first one away” still 
holds true. For my purposes, the most in- 
triguing method in Java’s SecurityManag- 
er is classDepth(), which returns an inte- 
ger telling you where the most recent 
occurrence of any method from a speci- 
fied class is on the call stack. This let me 
create a security manager that used two 
lists of names (one of the trusted classes 
and one of the untrusted classes) and let 
the security manager dynamically check 
the call stack and see which group the 
call ultimately came from. Oddly enough, 
the security manager doesn’t need a list 
of all classes in the program: The trusted 
class list only has to include those classes
that directly call methods in untrusted code
(plus those, in the case of callback meth- 
ods, that are directly called by untrusted 
code), and the untrusted list only needs 
the names of classes directly called by 
trusted code. 

Two examples might help illustrate how
this works. My custom security manager,
called “smartSM,” knows about only two
classes: T1 is trusted, and U1 is not. When
T1 calls a method in U1, U1 then calls a
method in U2, which in turn calls a
method in U3. U3’s method does something
that requires a security check, so the
JVM calls the appropriate method in my
security manager. When it examines the
stack, the security manager finds that U1
is higher on the stack than is T1 (it knows
nothing of U2, U3, and other classes, so
it ignores them), and it disallows the
operation.

Callbacks are handled similarly. Say that
T1 calls U1, and U1 then invokes a callback
method in T1. The T1 method wants
to write to a disk log file, which brings
the security manager into the act. It checks
the stack and finds that T1 is higher on
the stack (thanks to the callback method
that U1 called) than is U1, so it lets the
operation proceed. Again, as long as the
security manager knows about all classes
that directly call or are called across the
trusted/untrusted divide, this technique
will implement the desired security policy,
and I don’t have to load down my code
with all that Boolean flipping.

Because of the nature of my application,
I’m imposing a much stricter security model
than the one Sun and third parties normally
illustrate in their examples. My techniques
don’t allow system code to pass the
security checks, even if they’ve been
called most recently by untrusted code. If
you can't live with those restrictions, you
have to use a more complex and fragile
solution in Java 1.1 (which entails checking
the absolute stack positions of untrusted
classes, not relative positions, as does my
code), or use the new security enhancements
in Java 1.2. See the text box “JDK
1.2 Changes Everything” for details.

The classes to run the “smart” version
of the program are in Listing Two (SMdemo)
and the files smartSM, trustedSmart,
and untrusted2 (all three available
electronically; see “Resource Center,” page 5).


JDK 1.2 Changes Everything

In JDK 1.2, Java’s security model has
changed a great deal, and become
far more complex in an effort to provide
programmers and administrators
much needed finer-grained control over
security. I can’t begin to do the new
security features justice here; see
http://java.sun.com/products/jdk1.2/docs/guide/security/index.html
for the current specification.

One surprise is that the method I
based my smartSM class on,
SecurityManager.classDepth(), is deprecated in
JDK 1.2 beta 4. Sun confirmed that it is
indeed deprecated, but says it won’t be
disappearing for quite some time. Sun
is deprecating this method because it’s
normally used to find the absolute class
depth of an untrusted class, and this
number is then used with a heuristic to
pass judgment on the code’s actions. For
example, if the call stack depth of the
untrusted class is less than two, then the
operation is disallowed, since the call is
being made directly by untrusted code.
If it's greater than two, it’s allowed, with
the assumption that it came from trusted
code that was called by the untrusted
code. See Scott Oaks’s Java Security
(O'Reilly & Associates, 1998, ISBN
1-56592-403-7) for a more complete
description of this situation and the
problems it implies, as well as the changes
to the JDK 1.2 security model.


The return() is in the Mail 

An entirely different issue involves un- 
trusted code that simply never returns con- 
trol. The solution is to make my applica- 
tion create a separate thread for the 
untrusted code, start it, and then use the 
Thread.sleep() method to wait for the maximum
time I’m willing to let the untrusted
code run for each invocation. When
the main thread wakes up, it checks via
the untrusted thread’s isAlive() method to
see if it’s still running. If so, then I use the
System.exit() method to end the program.
(I would have preferred a less drastic way
to handle this situation, but Thread.suspend()
doesn’t always work. In my testing
with the JDK 1.1.6 under Windows 95,
I found that suspend() and stop() wouldn't
immediately stop the errant thread, and
wouldn’t stop it at all if it was in an infinite
loop. Also, the stop(), suspend(), resume(),
and destroy() methods of Thread
are all deprecated as of JDK 1.2, so I
wanted to avoid them. (See http://java.sun 
.com/products/jdk/1.2/docs/guide/misc/ 
threadPrimitiveDeprecation.html for an 
explanation of this decision.)) This is where
the security exposure I mentioned earlier 
in the switchableSM security manager 
comes into play. If the main thread wakes 
up and finds the untrusted code is still 
running, it has to relax security before it 
can do certain things like write to a log 
file. This can create a slim but real expo- 
sure if the untrusted code manages to try 
something in that sliver of time between 
when the security is disabled and the program
is terminated. (A reliable Thread.suspend()
would plug this gap nicely, but
that’s not available.) 

As it turns out, I still wasn’t done. It 
seems that some undesirable actions were 
still available to the untrusted code, like 
getting the thread for the main program 
and calling its sleep() method. The solution
was simply to create a new ThreadGroup
for the untrusted thread’s invocation,
and then create the thread in it.
(ThreadGroups are roughly analogous to 
directories in a file system, in that they 
form a hierarchy of nodes that contain 
threads and other thread groups.) This,
plus the smart security manager, prevents
the untrusted code from accessing other
threads.



Conclusion 
While this “smart SecurityManager” technique
solved my problem and is reasonably
easy to maintain, Sun has put programmers
in a bind. The Java 1.1 security
features are simple but limited for
more general-purpose situations, and 
their use often results in difficult to main- 
tain code because of their reliance on 
absolute call stack depth. The JDK 1.2 
security features are far more compre- 
hensive, but introduce a new level of 
complexity. On top of that, they’re still 
in a state of flux. I’ve tested this code on 
both JDK 1.1.6 and beta 4 of JDK 1.2, 
with identical results. 

I would like to see something similar 
to my smartSM become part of the secu- 
rity architecture in 1.2, possibly as an of- 
ficially supported helper method on top 
of the current architecture. Java needs a 
reliable way for programmers to specify 
classname-based security, without having 
to wrestle with the complexity of the new 
architecture. 


DDJ 





Listing One 


public class mutableBoolean 
{ public boolean enabled; } 


Listing Two 


// SMdemo: SecurityManager demo program. See the comment
// in main() before running this program.

import java.lang.*;
import java.io.*;
import java.util.*;

class SMdemo
{
    static public void main(String[] args)
    {
        // Only enable one of the following two lines!
        doSmartTest();
        // doSwitchableTest();
    }
    ////////////////////////////////////////////////////////////

    static void doSmartTest()
    {
        String[] untrusted_class_names = { "untrustedi" };
        String[] trusted_class_names = { "trustedSmart" };

        smartSM sm = new smartSM(untrusted_class_names,
                                 trusted_class_names);

        AddDumpLine("SMdemo: About to install custom SM.");

        System.setSecurityManager(sm);

        AddDumpLine("SMdemo: Just installed custom SM, about to run test.");

        trustedSmart tSmart = new trustedSmart();
        tSmart.doTest();

        AddDumpLine("SMdemo: Just returned from trustedSmart; about to end.");
    }
    ////////////////////////////////////////////////////////////

    static void doSwitchableTest()
    {
        AddDumpLine("SMdemo: About to run test.");

        trustedSwitchable tSwitchable = new trustedSwitchable();
        tSwitchable.doTest();

        AddDumpLine("SMdemo: Just returned from trustedSwitchable; about to end.");
    }
    ////////////////////////////////////////////////////////////

    // Append one line to the dump file, creating the file if needed.
    static private void AddDumpLine(String s)
    {
        try
        {
            RandomAccessFile ofile =
                new RandomAccessFile("SMdemo_dump.txt","rw");

            ofile.seek(ofile.length());
            ofile.writeBytes(s);
            ofile.writeBytes("\r\n");
            ofile.close();
        }
        catch (IOException e)
        {
            System.err.println(e);
            return;
        }
    }
}
ALGORITHM ALLEY


Code Tuning in 
Context 


Jon Bentley 


Some people spend hours tweaking
the very last bit of horsepower out of 
their automobile engines. Others ad- 
just strings until their violins sing 
sweetly, practice golf swings, or finely pol- 
ish web pages. I, too, enjoy tuning: I love 
to tinker with code to make it run faster. 
When I started programming, code tun- 
ing was easy: You found the innermost 
loop, got rid of the expensive parts, and 
your program ran faster. Nowadays, 
though, I’m no longer surprised when I 
tune code and find that it is not substan- 
tially faster, and is sometimes slower. 
What has changed? Lots of things. Com- 
puter architectures have more accelerators, 
and compilers perform more optimizations. 
This column studies code tuning in sever- 
al different contexts. I'll start with a family 
of four related algorithms, apply four tune- 
ups to them, and measure the results on 
various hardware and compilers. 


Background 

Last month’s column presented a sequence 
of four algorithms to solve the Traveling 
Salesman Problem (TSP), and analyzed 
the run times of each. 

Algorithm 1 was the simplest code for
the job. It enumerated all n! permutations,
and stored the shortest. Since it
employed n distance calculations to find
the cost of each tour, it used a total of
n×n! distance functions.

Algorithm 2 was a minor variant of Algorithm 1.
It required fixing a starting city
by changing one parameter in one call from
n to (n-1); this reduced the number of
tours examined from n! to (n-1)!, and the
total run time to n×(n-1)! or n! distance
functions.

Algorithm 3 reduced the number of dis- 
tance functions by keeping a partial sum 
of the tour distance. The total number of
distance calculations was approximately
(1+e)×(n-1)!, where e=2.71828... is the
base of natural logarithms.

Algorithm 4 pruned the search by re- 
turning whenever the partial sum exceed- 
ed the length of the smallest tour yet ob- 
served. Experiments showed that it is much 
faster than Algorithm 3, but its run time was 
very dependent on the particular input data. 

These algorithms are fine candidates for 
tuning: Each is a small loop that is likely 
to consume many cycles in an application. 
I'll tune all four in parallel for the sake of 
experiments; in a real application, you 
would tune only the fastest algorithm. 


Tuneup B: Precompute Distances 
The critical operation in TSP algorithms is 
often the distance function: 


Dist geomdist(int i, int j)
{   return (Dist) (sqrt(sqr(c[i].x-c[j].x)
                      + sqr(c[i].y-c[j].y)));
}


A profile showed that Algorithm 1 spent 
16.6 percent of its time in geomdist, 64.6 
percent in sqrt, and 5.7 percent in this sqr 
function: 


float sqr(float x)
{   return x*x; }


Together, these three functions account 
for about 87 percent of the time used by 
the program. 

Exercise 1. How can you reduce the 
time the program spends in the geomdist 
function? 

You could tune the function in several 
ways: You could write the sqr functions
inline, eliminate common subexpressions, 
or use a special-purpose square root. I 
will, instead, leave its cost unchanged, but 
almost eliminate its time by precomputing
all n^2 possible distances and storing
them in an array:





for (i = 0; i < n; i++)
    for (j = 0; j < n; j++)
        distarr[i][j] = geomdist(i, j);

This approach trades space for time. The
distarr table uses n^2 words of storage, but
you can now compute the distance from
point i to point j with an array access:


#define dist(i, j) distarr[i][j]


Exercise 2. How will this tuneup change 
the run times of the various algorithms? 

I'll call the original code for each algo- 
rithm (which calls the function) “Tuneup 
A,” and the modified code (which access- 
es the array) “Tuneup B.” Thus Algorithm 
1A (implemented in tsp2.c; available elec- 
tronically, see “Resource Center,” page 5) 
is the original Algorithm 1, Algorithm 1B 
incorporates the array, and so on for the 
three other algorithms. Table 1 shows be- 
fore and after run times on a 200-MHz Pen- 
tium Pro. Because the later algorithms are 
so much faster than the earlier ones, I had 
to increase the input size n on the various
rows to get accurate run times. In every 
case, though, Tuneup B represents a sub- 
stantial performance increase. Furthermore, 
this is not the kind of speedup supported 
by hardware or system software. 

Lesson: Tuneups beyond the reach of 
hardware accelerators and optimizing 
compilers still play a vital role in software 
performance. 

The rightmost column in Table 1 shows 
the speedup ratio. Algorithm 1B is about 
5.5 times faster than the original Algorithm 
1A: That “dumb” algorithm spent most of 
its time performing the critical distance 
calculations, and very little time in overhead.
Algorithm 2 uses the same code as
Algorithm 1, with just a single parameter 
changed. “Smart” Algorithms 3 and 4 add 
overhead (for maintaining partial sums 
and pruning the search) to reduce distance
computations, so they are less affected
by the dramatic reduction in the
cost of distance calculations. 
Lesson: Tuneups to critical operations
have the most impact on “dumb” algo- 
rithms with little overhead. 

This tuneup successfully added space 
to reduce run time. That technique was a 
sure bet in the old days, but it can mean 
a slowdown when running out of mod- 
ern caches. 

Lesson: Measure the effect of tuneups 
to ensure they indeed reduce the run time. 

Exercise 3. Write a program to deter- 
mine experimentally the run times of the 
key operations in these algorithms, such as 
loops, array accesses, and arithmetic op- 
erations. 


Tuneup C: Change Arithmetic 

These algorithms must keep sums of po- 
tentially large numbers. I’ve been bitten 
in the past by arithmetic precision (or lack 
thereof), so I instinctively made the type 
of distances the largest possible: 


typedef double Dist; 


In the past, changing the double to float 
or long or short would have easily given 
a huge speedup. 

Exercise 4. Estimate the result of chang- 
ing to various data types on your machine. 

There’s a little more to the job than 
changing one typedef. The first task is to 
scale distances to use the precision avail- 
able in our chosen word lengths: 


return (Dist) (DistFACT * sqrt(...)); 





Table 1: Tuneup B run times on a 200-MHz Pentium Pro. 

Table 3: A profile of Algorithm 4E. 

Exercise 5. What are appropriate scale 
values? What other changes need to be 
made to the program? How would you in- 
corporate them into a program? 

Algorithms 1C through 4C incorporate 
different kinds of arithmetic. Table 2 pre- 
sents experiments on Algorithm 3C (which 
has an intermediate amount of overhead) 
with n=9 on a decade’s worth of Intel pro- 
cessors. The first six lines used the same 
16-bit executable, while the bottom line 
used a 32-bit executable. Old hardware was 
pretty fast on shorts and longs, and much 
slower on floats and doubles. In recent years, 
though, computer architects have used mas- 
sive transistor counts and wide data paths 
to make floating-point operations run al- 
most as fast as their integer cousins. 

Can modern programmers ignore an- 
cient hardware? Some indeed can: They 
know their programs will run only on par- 
ticular machines. Others certainly cannot: 
Their programs run today on small pro- 
cessors (in cell phones, joysticks, or even 
legacy computers), and who knows where 
they'll be used tomorrow? 

Lesson: Old hardware was relatively 
slow for floats and doubles. Modern (large) 
processors operate on such variables as 
quickly as on ints and shorts. 

For the remainder of this column, I'll 
focus on modern machines, and reckon 
Tuneup C as a blind alley. Keep it in mind, 
though, the next time you work on a small 
processor. 


Tuneups to Reduce Overhead 
The innermost loop of search1 currently is: 


for (i = m-1; i >= 0; i--) { 
    swap(i, m-1); 
    search1(m-1); 
    swap(i, m-1); 
} 


Table 2: Algorithm 3C on Intel processors. 





This consists of two calls to the swap func- 
tion and a recursive call to itself; Algorithms 
2 through 4 have similar inner loops. 

In the old days, a sure speedup could be 
gained by writing the swap function inline, 
and removing the overhead of function calls. 
This process results in Tuneup D: 


for (i = m-1; i >= 0; i--) { 
    t = p[m-1]; 
    p[m-1] = p[i]; 
    p[i] = t; 
    search1d(m-1); 
    t = p[i]; 
    p[i] = p[m-1]; 
    p[m-1] = t; 
} 


Tuneup E pares a few additional lines of 
fat from the innermost loop by moving 
two assignments out of the loop and re- 
moving one redundant assignment: 


t = p[m-1]; 
for (i = m-1; i >= 0; i--) { 
    p[m-1] = p[i]; 
    p[i] = t; 
    search1e(m-1); 
    p[i] = p[m-1]; 
} 
p[m-1] = t; 
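To check that the pared-down loop still visits every permutation and leaves the array as it found it, here is a minimal sketch that wraps the Tuneup E loop in a complete recursive function. The names search_e, count_perms, visited, and MAXN are assumptions for illustration; the tour-length bookkeeping of the real algorithms is replaced by a simple counter.

```c
#include <assert.h>

#define MAXN 12                 /* assumed bound on the permutation size */

static int  p[MAXN];            /* the permutation being rearranged in place */
static long visited;            /* number of complete permutations reached */

/* Enumerate all arrangements of p[0..m-1] using the Tuneup E loop. */
static void search_e(int m)
{
    int i, t;
    if (m == 1) {               /* one element left: a full permutation */
        visited++;
        return;
    }
    t = p[m-1];                 /* hoisted out of the loop */
    for (i = m-1; i >= 0; i--) {
        p[m-1] = p[i];          /* move p[i] into the last slot */
        p[i] = t;
        search_e(m-1);
        p[i] = p[m-1];          /* restore p[i]; p[m-1] is fixed up later */
    }
    p[m-1] = t;                 /* single restore after the loop */
}

/* Count the permutations of n items; p[] is left in its initial order. */
long count_perms(int n)
{
    int i;
    assert(n >= 1 && n <= MAXN);
    for (i = 0; i < n; i++)
        p[i] = i;
    visited = 0;
    search_e(n);
    return visited;
}
```

Calling count_perms(4) should reach 4! = 24 leaves, and p[] should come back in the order 0, 1, 2, 3, which is what makes the deferred restore of p[m-1] safe.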


The resulting program spends the lion’s 
share of its run time in this function. 
Table 3 presents a profile of Algorithm 
4E with n=12. 

Code tuners face two nightmare pro- 
files. When the profile is entirely flat, and 
each function accounts for just 1 or 2 per- 
cent of the run time, it is hard to know 
where to start. This profile is at the other 
end of the spectrum: All the run time is 
in one function, and I’ve already squeezed 
it all I can. 


Results of the Tuneups 

I now have five tuneups across four al- 
gorithms and I’m ready to examine their 
interactions in detail. I'll start with a sim- 
ple study of compiler optimizations using 
versions of Algorithm 4 with n=12; see 
Table 4. 

With the single exception of Algorithm 
4A on the Pentium Pro, optimization is 
a win. 

Lesson: Compiler-optimized code is 
sometimes dramatically faster than unop- 
timized code. Compiler optimization may, 
however, slow down a program. 

The compiler-optimized versions of 
Algorithms 4D and 4E were slower than 
(simpler) Algorithm 4C on the MIPS pro- 
cessor, yet faster on the Pentium Pro. 

Lesson: Code tuning may interact bad- 
ly with compiler optimizations, and in- 
crease the resulting run time. 

All of the other experiments in this col- 
umn have been conducted with opti- 
mization enabled. 

Our next experiment runs the complete 
set of algorithms and tuneups (with the 
exception of dead-end Tuneup C) on a 
200-MHz Pentium Pro; see Table 5. 

The result of Tuneup B was previous- 
ly discussed. Tuneup D gave a speedup 
of 13 to 30 percent (but recall that it 
caused a slowdown on a MIPS R10000). 
Tuneup E made little difference on Algo- 
rithms 1 and 2 (where run time was still 
dominated by summing distances), but 
gave a 12 percent speedup on Algorithms 
3 and 4. 

Lesson: Fine-tuning inner loops can 
still yield 10 percent here and 30 percent 
there, but sometimes slows the code. 

The overall speedup, from Tuneup A 
to Tuneup E, ranged from a factor of 6.6 
to 3.6. 

Our final experiment (see Table 6) runs 
all Tuneups on Algorithm 4, across three 
Intel processors. 

Lesson: Tuning code on old processors 
was predictable and could yield speedups 
of orders of magnitude. 

Lesson: Tuning code on modern pro- 
cessors is less predictable and yields small- 
er, but still important, speedups. 

Exercise 6. Perform additional experi- 
ments changing variables such as com- 
piler, operating system, optimization level, 
and so on. 


Principles 

Back in the old days, when code was easy 
to tune, supermarket shopping was pret- 
ty easy, too. I would walk through the 
store, putting items into my basket, keep- 
ing a running total as I went. I sometimes 
snapped up a few items on “two-for-one” 
sale. When I came to the cashier, I handed 
over a coupon or two, and if the total 
price was more than a few percent off my 
estimate, I would ask to see the receipt. 

Modern marketing techniques have 
changed all that. I now go to the store 
armed with more coupons. I can swipe 
my frequent-shopper card across a ma- 
chine at the front of the store to get just- 
in-time coupons. I try to take advantage 
of “buy five of these, get two of those 
free” sales when I can. When I present 
my card to the checker, I get extra dis- 
counts on items already scanned, custom- 
printed coupons for future use, and cred- 
it towards a free turkey if I rack up 
enough sales before the next holiday. The 
process results in lower prices (I sup- 
pose), but I no longer understand my 
shopping trips. 

Table 5: Running the complete set of algorithms and tuneups on a 200-MHz Pentium Pro. 

Table 6: Running all Tuneups on Algorithm 4, across three Intel processors. 

What marketing techniques have done 
to food shopping, computer architects and 
compiler writers have done to code tuning. 
Pipelines and other instruction-level 
parallelism offer many instructions for the 
price of one (except when hazards jeopardize 
the deal). Multilevel caches usually 
yield on-chip speed at DRAM prices. 
Optimizing compilers manipulate your 
code at no additional charge. These de- 
vices automate many of the tasks that used 
to fall to code tuners, and they usually do 
a pretty fine job. But some code-tuning 
tasks remain, and we have to keep a much 
larger context in mind as we set to work. 


Solutions to Selected Exercises 
Exercise 3. Brian Kernighan, Chris Van 
Wyk, and I described “An Elementary C 
Cost Model” in the February 1991 issue of 
UNIX Review. When I ran a 50-line C pro- 
gram on the 200-MHz Pentium Pro, it pro- 
duced this estimate of the run times for 
various operations: 


Operation                 Nanosecs 
Null Loop 
  {}                            36 
Int Operations 
  i1++                          -5 
  i1 = i3++                     -1 
  i1 = i2 + i3                   3 
Control Structures 
  if (i == 5) i1++               0 
  while (i < 0) i1++             3 
Array Operations 
  p[i] = i                      40 
  i2 = p[i]                     20 
  swap(0, i)                   160 
Square Root 
  f1 = sqrt(f2)                888 


The first line says that a null loop requires 
about 36 nanoseconds per operation. The 
second through fourth lines show the costs 
of arithmetic operations on integers. 

Lesson: Adding instructions to a small 
inner loop can make it run faster. 

The array operations are more sub- 
stantial, and the square roots are relative- 
ly expensive. 
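A measurement of this kind can be sketched in a few lines. The listing below is not the 50-line program mentioned above; it is a minimal illustration, with the hypothetical name time_increment, of timing one loop body with the standard clock() function. The counter is declared volatile so that an optimizing compiler cannot delete the loop.

```c
#include <assert.h>
#include <time.h>

/* Run the body i1++ n times and store the average cost per iteration,
   in nanoseconds, through ns. Returns the final counter value so the
   caller can confirm that the work was really done. */
long time_increment(long n, double *ns)
{
    volatile long i1 = 0;       /* volatile: keep the loop from being optimized away */
    long i;
    clock_t start, stop;

    assert(n > 0 && ns != NULL);
    start = clock();
    for (i = 0; i < n; i++)
        i1++;
    stop = clock();
    *ns = 1e9 * (double)(stop - start) / CLOCKS_PER_SEC / (double)n;
    return i1;
}
```

To estimate the cost of the operation alone, time an otherwise identical loop with an empty body and subtract. That difference can come out negative, as in the first two integer lines of the table. Since clock() has coarse resolution on many systems, n should be large enough that the loop runs for a good fraction of a second.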

Exercise 5. This code sequence as- 
sumes that precisely one of the variables 
SHORT, LONG, FLOAT, and DOUBLE is 
nonzero, and that variable signifies the 
type to use: 


#if SHORT 
typedef short Dist; 
#define INF 30000 
#define DistFACT 100 
#elif LONG 
typedef long Dist; 
#define INF 1000000000 
#define DistFACT 100000 
#elif FLOAT || DOUBLE 
typedef float Dist; 
#define INF ((Dist) 1e35) 
#define DistFACT 1.0 
#endif 


Scaling and rounding may slightly change 
the output sequence, but the tour distance 
should be very close to the true distance. 
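The scaling step can be illustrated in isolation. The sketch below hard-codes the short settings from the listing and adds a hypothetical helper, scale_dist, that converts a true distance into the scaled integer type. Rounding to nearest and saturating at INF are choices made for this illustration; the expression shown earlier in the column simply casts the scaled value.

```c
#include <assert.h>

typedef short Dist;             /* the short settings from the listing */
#define INF 30000
#define DistFACT 100

/* Scale a true distance d into Dist units (hundredths here),
   rounding to nearest and saturating at INF. */
Dist scale_dist(double d)
{
    double scaled;
    assert(d >= 0.0);
    scaled = DistFACT * d + 0.5;
    if (scaled >= INF)
        return INF;
    return (Dist)scaled;
}
```

With these settings a distance of 5.0 becomes 500, so two decimal places survive in a 16-bit integer. Sums of scaled distances must also stay below the range of the type, which limits how large the coordinates and the tour can be.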


DDJ 

DR. ECCO’S OMNIHEURIST CORNER 


Rosetta 


Dennis E. Shasha 


The woman said, “If our friends in the 
FBI are telling the truth, this group is 
sending terrorism instructions over 
the Web.” Our visitor had the inquisitive 
eyes of a mathematician, mixed 
with the reticence of a National Security 
Agency officer. She was also Ecco’s long- 
time friend Karmen Simon. Ecco knew too 
well that he shouldn’t ask questions. 
Whereas he had had trouble with the NSA 
in the past (see my book Codes, Puzzles, 
and Conspiracy) his friendship with Kar- 
men overwhelmed his revulsion for the 
institution. 

“They are not very sophisticated,” Si- 
mon continued. 

“A cursory analysis shows a single sub- 
stitution code. But the problem is that the 
message is spread on a bunch of web 
pages. These have titles coming from dried 
fruits: raisin, date, fig, prune, apricot, 
pineapple, grapefruit, currant, and co- 
conut. We believe the recipient is meant 
to lay out the pages in a specific order. 
The first question is which order? 

“Our FBI friends tell us that this group, 
calling themselves ‘Remember Waco’— a 
group of antifederalists— normally work 
from a template and we want to exploit 
that fact.” 

“What do you mean by a template?” 
Liane asked. 

“Well, we have a kind of hyperrosetta 
stone,” Simon responded. 


Dennis, a professor of computer science at 
New York University, is the author of The 
Puzzling Adventures of Dr. Ecco (Dover, 
1998), Codes, Puzzles, and Conspiracy 
(W.H. Freeman & Co., 1992), Database 
Tuning: A Principled Approach (Prentice 
Hall, 1992), and (coauthored with Cathy 
Lazere) Out of Their Minds: The Lives and 
Discoveries of 15 Great Computer Scien- 
tists (Springer Verlag, 1998). He can be 
contacted at DrEcco@ddj.com. 

“We know that the graph has nine 
pages numbered consecutively: 1 2 3 4 5 
6 7 8 9. The message is laid out in numerical 
order on those nine pages. According 
to our informants, the hyperlinks 
are supposed to link the pages as follows: 


Wi ON WW GH \O FR CO CO Lv 
NED BD SO NI RRO ODD 


ee oon 
Dun pw 


“Each row represents a hyperlink. 

“So, for example, page 2 has an href to 
itself and to page 6. At the same time, page 
6 has two hyperlinks to page 2 and page 
5 has one. That’s the classic arrangement. 

“The hrefs we actually see, however, 
come in an entirely different form: 


(fig prune 
grapefruit grapefruit 
apricot currant 
pineapple apricot 
raisin currant 
grapefruit currant 
apricot pineapple 
pineapple grapefruit 
pineapple date 
fig prune 
pineapple raisin 
coconut grapefruit 
raisin currant 
raisin coconut 
prune fig 
date fig 
raisin prune 
grapefruit pineapple 
prune prune 
currant fig) 


“So, the first problem is to determine 
which dried fruit corresponds to page 
1, which to page 2, and so on. All we 
know for sure is that the edges should 
correspond at least approximately, so if 
apricot is page 1, then apricot should 
point to two pages, corresponding to 
pages 4 and 5. 

“What confuses things is that two hrefs 
may have been reversed according to our 
informants. We want to identify those two 
if possible. 

“The second problem is to decode the 
message. Some think it’s better to start with 
the decoding, under the theory that the de- 
coding will help the ordering. I don’t know. 

“Since they change the messages more 
often than the links, we are certainly in- 
terested in the ordering in any case. 

“Here is the text associated with each 
page: 

apricot: “rrmlbgvp” 
coconut: “6nlm4bgu” 
currant: “rmaromvx” 
date: “8nb8vmhw” 

fig: “xu6gllm7” 
grapefruit: “wdddm3r8” 
pineapple: “un6mx6m6” 
prune: “rmx5Smram” 
raisin: “a7m9rgm9” 


“Assuming you can order the pages, the 
message in ciphertext is just the concate- 
nation of the messages in each page. We 
know from our agents that the code is a 
single substitution code in English map- 
ping from all lowercase letters, all digits, 
and the space character to the same al- 
phabet, viz all lowercase letters, all dig- 
its, and the space character. In this case, 
however, the encrypted text has no space 
character. That may or may not mean that 
the clear text has no blank spaces.” 
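Once a candidate key is in hand, applying a single substitution over this 37-character alphabet is mechanical. The sketch below assumes a key given as a 37-character string in which position k holds the plaintext character for the k-th alphabet character; the names ALPHABET and decode are hypothetical, and finding the key remains the puzzle.

```c
#include <assert.h>
#include <string.h>

/* Lowercase letters, digits, and the space character: 37 symbols. */
static const char ALPHABET[] = "abcdefghijklmnopqrstuvwxyz0123456789 ";

/* Decode cipher into plain (which must have room for strlen(cipher) + 1
   characters). key[k] is the plaintext symbol for ALPHABET[k].
   Returns 0 on success, -1 if cipher contains a symbol outside the alphabet. */
int decode(const char *cipher, const char *key, char *plain)
{
    size_t i;
    assert(strlen(key) == strlen(ALPHABET));
    for (i = 0; cipher[i] != '\0'; i++) {
        const char *pos = strchr(ALPHABET, cipher[i]);
        if (pos == NULL)
            return -1;
        plain[i] = key[pos - ALPHABET];
    }
    plain[i] = '\0';
    return 0;
}
```

A table-driven decoder like this makes it cheap to try many partial keys while working out letter frequencies by hand.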

“The basic problem, Ecco, as I see it, is 
approximate graph isomorphism,” I vol- 
unteered. 

“It’s perhaps made easier by the fact 
that some edges are represented multiple 
times. But the basic principle is the same. 
We want a mapping from fruit to page 
numbers and then we want to do a de- 
coding.” 
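One way to make “approximate” concrete is to score a candidate mapping by how many edge slots disagree. The sketch below works on a toy three-vertex graph, not the puzzle's nine pages, and stores each graph as a matrix of edge counts so that repeated hyperlinks are handled. The names tmpl, seen, and mismatch are assumptions for illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define NV 3                    /* toy size; the puzzle itself has nine pages */

/* g[u][v] is the number of links from u to v (multi-edges allowed). */
static const int tmpl[NV][NV] = { {0, 1, 0}, {0, 0, 2}, {0, 0, 1} };
static const int seen[NV][NV] = { {0, 2, 0}, {0, 1, 0}, {1, 0, 0} };

/* Score a mapping from observed vertices to template vertices:
   the sum over all ordered pairs of |template count - observed count|.
   A score of 0 means the mapping is an exact isomorphism. */
int mismatch(const int a[NV][NV], const int b[NV][NV], const int map[NV])
{
    int u, v, diff = 0;
    for (u = 0; u < NV; u++)
        assert(map[u] >= 0 && map[u] < NV);
    for (u = 0; u < NV; u++)
        for (v = 0; v < NV; v++)
            diff += abs(a[map[u]][map[v]] - b[u][v]);
    return diff;
}
```

Under this score, a single reversed link that is otherwise correctly mapped contributes 2 (one link missing, one extra), so a mapping with a small even score is a candidate for “correct apart from reversals.” Trying all 9! = 362,880 mappings is well within reach of a brute-force search.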

“Quite right, Professor,” Ecco said to me. 

“I understand that your field hasn’t quite 
decided how difficult even exact graph 
isomorphism is.” 

“True,” I admitted. “No efficient, good 
algorithms are known, but the problem is 
not known to be hard either.” 


Reader: Your job is to solve these three 
problems: Find the correct ordering among 
the pages, identify the reversed edges if any, 
and then decrypt the message as much as 
possible (Hint: There are three 2s, two Xs, 
and a v in the license plate). 


Ecco and Liane worked on the prob- 
lem for many minutes. As Ecco handed 
the written solution to Simon, he smiled 
and said, “Personally, I like dates better 
than figs.” 


Last Month’s Solution 

The solution to the “Joints In Space” prob- 
lem required Ecco and Liane to propose 
a rectanguloid design consisting of a 4x4x3 
arrangement of the cubes. 

Given this, each cube can be identified 
by three coordinates. For example, 
(4,3,1,E) means that company E is in po- 
sition (4,3,1). 

Here is an assignment that obeys Tyler's 
original constraints: 


(1,1,1,A) (1332) 
(2,1,1,A) (2,3,2,A) 
(3,1,1,A) (3,3,2,0) 
(4,1,1,B) (4,3,2,E) 
(1,2,1,A) (1,4,2,F) 
(2,2,1,A) (2,4,2,E) 
(3,2,1,C) (3,4,2,E) 
(4,2,1,C) (4,4,2,K) 
(1,3,1,A) G33.) 
(2,3,1,D) (2,1,3,A) 
(3,3,1,2) (3,1,3,M) 
(4,3,1,E) (4,1,3,M) 
(1,4,1,F) (1,2,3,L) 
(2,4,1,D) 223.) 
(3,4,1,G) (32,3, 
(4,4,1,G) (4,2,3,E) 
(1,1,2,H) (13:3,5) 
(2,1,2,A) (25.3.0) 
€5.102,45) (3,306) 
(4,1,2,C) (4,3,3,E) 
(1,2,2,D) (1,4,3,N) 
(2,2,2,A) (2,4,3,E) 
(3,2,2,C) (3,4,3,E) 
(4,2,2,C) (4,4, 3,0) 

Reader Notes 

Natasha’s “Dig” problem (DDJ, February 
1999) engendered many clever respons- 
es. For this problem, however, machine 
triumphed over (wo)man in the sense 
that the best solutions I received were all 
programmed by some kind of search pro- 
cedure. 

The following all suggested an improved 
approach over Dr. Ecco’s: Jon 
Beal, Jimmy Hu, Ted Alper, Richard W. 
Lipp, Michael Williams, Dave Weiblen, 
Pearl Pauling, Jim Greer, Dan Hirschberg, 
Paul DeMarco, Bob Harris, Jean-Francois 
Halleux, Christian Tanguy, Sam A. Virgillo, 
Dr. Burghart Hoffrichter, Christopher 
Oliver, Allan Vasenius, Charles Taylor, 
Koos du Toit, Martin Brown, Kent 
Donaldson, Onno Waalewijn, Yves 
Piguet, Michael S. VanVertloo, Rodney P. 
Meyer, Benjamin C. Chaffin, Serguie 
Patchkovskii, Bharat Chandramouli, Hans 
Knorr, and Christopher Mills. The best 
answers in the case in which all items 
must have a positive integer label came 
from Ben Chaffin. 

Ben’s labeling when at most three ob- 
jects can be taken was: 2, 4, 8, 16, 32, 60, 
73, 116, 207, 230, 341, and 452. This yields 
a trim maximum sum of 1023. (Ecco’s so- 
lution was quite a bit worse at 1401.) Ben’s 
positive integer labeling for up to four ob- 
jects yielded a maximum sum of 2550: 16, 
18, 19, 20, 24, 32, 64, 128, 237, 420, 712, 
and 1187. 

Many readers thought negative num- 
bers wouldn’t help yield a smaller maxi- 
mum, but a few intrepid readers showed 
this to be wrong. The basic algorithm is 
to use negative numbers and then take 
the absolute value of any resulting sum. 
This guarantees a nonnegative answer, of 
course. The real cleverness is to ensure 
unique decodability. Dan Hirschberg 
showed that when at most three choices 
can be made, then negative numbers can 
reduce the maximum sum to 734: 2, 3, 4, 
8, 16, 30, 56, 110, 173, 244, 317, -626. 

In the best solution, Ted Alper then 
showed that negative numbers can reduce 
the maximum sum to 1527 for up to four 
choices: 2, 3, 4, 8, 16, 32, 61, 116, 
224, 416, 771, -1468. 

Since I know of no proof of minimali- 
ty for any of these sequences, a clever 
reader may find a better solution than the 
ones mentioned earlier. For that reason, 
I’ve written a checking program in K that 
I will send to anyone who wants it. The 
program takes a sequence and the num- 
ber of choices that can be made and sees 
whether the sequence results in the unique 
decodability of subset sums. 
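The same kind of check is easy to sketch in C. The listing below is not the K program; it is a brute-force illustration under stated assumptions: the empty selection counts as a valid choice, sums are compared by absolute value (as in the negative-number scheme), and the sequence has at most 20 labels. The name uniquely_decodable is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Three-way comparison of two longs for qsort. */
static int cmp_long(const void *a, const void *b)
{
    long x = *(const long *)a, y = *(const long *)b;
    return (x > y) - (x < y);
}

/* Return 1 if every selection of at most k of the n labels has a
   distinct absolute sum, 0 if two selections collide. */
int uniquely_decodable(const long *label, int n, int k)
{
    unsigned long mask, limit;
    long *sums;
    size_t count = 0, i;
    int ok = 1;

    assert(label != NULL && n >= 1 && n <= 20 && k >= 0);
    limit = 1UL << n;
    sums = malloc(limit * sizeof *sums);
    assert(sums != NULL);

    /* Enumerate every subset as a bit mask; keep those of size <= k. */
    for (mask = 0; mask < limit; mask++) {
        long s = 0;
        int j, taken = 0;
        for (j = 0; j < n; j++)
            if (mask & (1UL << j)) {
                s += label[j];
                taken++;
            }
        if (taken <= k)
            sums[count++] = labs(s);
    }

    /* Sort the sums; any equal neighbors mean an ambiguous decoding. */
    qsort(sums, count, sizeof *sums, cmp_long);
    for (i = 1; i < count; i++)
        if (sums[i] == sums[i-1]) {
            ok = 0;
            break;
        }
    free(sums);
    return ok;
}
```

For 12 labels the loop examines only 4096 masks, so checking a proposed sequence is instantaneous; the hard part is searching for a sequence with a smaller maximum sum.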


DDJ 
PROGRAMMER’S BOOKSHELF 


A Revolution 
Oft-Delayed 


Gregory V. Wilson and William Stallings 


It's been a long time since I devoted an 
entire review to a single book. One reason 
is that comparing and contrasting 
two or three books gives me a chance 
to say something about their context, and 
about the wider state of modern com- 
puting as a whole. Another is that there 
just isn’t enough in most books to merit 
500 or 600 words of discussion. 

Clemens Szyperski’s Component Soft- 
ware: Beyond Object-Oriented Program- 
ming certainly has enough in it to carry 
an entire review. In fact, this superb book 
has made me see the past, present, and 
near-future state of our industry in a new 
way. Szyperski not only provides an in- 
depth exposition, comparison, and cri- 
tique of today’s three major component 
standards (COM, CORBA, and JavaBeans), 
he also describes what it will take for those 
technologies to deliver on the promises 
that were made for object-oriented pro- 
gramming in its early days. 

The second sentence of Component Soft- 
ware defines its subject area. According 
to Szyperski: 


...software components are binary units of 
independent production, acquisition, and 
deployment and interact to form a func- 
tioning system. 


The “Foundation” section of the book 
(Chapters 4 through 11) is a very detailed 
exploration of why all of the adjectives 
in that defining phrase are necessary. 
Components must be independent and 
binary to allow for multiple independent 
vendors and robust integration; unlike 
object-oriented programming, which is 
primarily concerned with the production 
of software, component systems must deal 
with deployment issues, such as installation, 
versioning, error handling, and cross-platform 
compatibility. 

Greg is the author of Practical Parallel Programming 
(MIT Press, 1995), and coeditor 
with Paul Lu of Parallel Programming 
Using C++ (MIT Press, 1996). Greg can 
be reached at gvwilson@interlog.com. 
William’s most recent book is SNMP, 
SNMPv2, SNMPv3, and RMON 1 and 2, 
Third Edition (Addison Wesley, 1999). He 
can be contacted at ws@shore.net. 

The “State of the Art” section (Chapters 
12 through 19) looks in detail at where 
component technology is today. These 
chapters are worth the price of the book 
by themselves: I have rarely read as clear, 
as incisive, or as even-handed a compar- 
ison of technological alternatives. Szyper- 
ski is clearly on intimate terms with COM, 
CORBA, and JavaBeans; he not only points 
out their strengths and weaknesses, but 
also explains why their designers made 
the design choices they did. The last two 
sections of the book are “The Next Gen- 
eration” (Chapters 20 through 25) and 
“Markets and Components” (Chapters 26 
through 28). These are necessarily more 
speculative, but are still solidly grounded 
in the needs of real-world, large-scale in- 
dustrial applications. 

The “Foundation” section was the most 
illuminating for me, but also the hardest 
work. It starts with Chapter 4, which de- 
fines what components are and are not. 
Some of these definitions may seem like 
hair-splitting, but each distinction or fine 
point turns out to have been made for a 
reason. Chapter 5, “Components, Inter- 
faces, and Re-Entrance,” explains why 
component-based programming is hard 
to get right. One of the reasons is that it 
is hard to specify software components at 
the same useful level of detail as electronic 
or mechanical components are specified. 
Another, more fundamental, reason is that 
callbacks and extensions mean that use- 
ful software components are rarely lay- 
ered as cleanly as purists would like. The 
examples in this chapter are occasionally 
hard to follow, but that isn’t Szyperski’s 
fault: Simple examples just don’t show the 
problems that real-world systems en- 
counter. 

Chapters 6 (“Polymorphism”) and 7 
(“Object versus Class Composition”) look 
at the idea of substituting components for 
one another. What does it take to make 
this possible? To make it safe? To make it 
economical? What happens when compo- 
nents evolve? Is multiple inheritance a nec- 
essary evil? If so, what kind of multiple in- 
heritance? Szyperski points out that this can 
mean several different things, some of 
which are more or less necessary, or more 
or less evil, than others. 

Each of the 13 sections in Chapter 8 dis- 
cusses one aspect of scaling and granu- 
larity. Components are units of abstraction, 
of accounting, of analysis, of compilation, 
and so on down the alphabet to mainte- 
nance and system management. This care- 
ful enumeration of what components are 
good for is used in the later discussion of 
COM, CORBA, and JavaBeans to analyze 
what each system does and does not 
provide. 

I could go through the contents of the 
other chapters at this point, but it would 
be much better for you to go out and buy
the book yourself. Szyperski’s English may 
sometimes sound a bit odd, and some of 
the things he says about JavaBeans are al- 
ready a bit out of date, but these are very 
minor quibbles. Component Software is 
quite simply the best book on computing 
I read in 1998, and deserves a wide, at- 
tentive audience in both industry and 
academia. 


—G.V.W. 


Dr. Dobb's Systems
Internals CD-ROM


If you have been looking for the inside
scoop on Windows NT and how it
functions, then look no further! The Dr.
Dobb's Systems Internals CD-ROM is a
complete resource for people who need
to gain a better understanding of how
Windows systems operate. Network
administrators, software developers, and
users alike stand to benefit from this release. The Systems
Internals CD-ROM contains source code and executables for
freeware and shareware utilities.


The Practical Performance Analyst, by
Neil J. Gunther, is a superb book that
should be on the shelf of every programmer,
engineer, systems analyst,
and manager who is responsible for performance
analysis and design of computer
systems or data networks. It is one of
the best books on performance analysis I
have ever encountered.

The Practical Performance Analyst does 
not assume a background in performance 
analysis or queuing theory. Instead, its 
emphasis is on the practical use of mod- 
eling tools, rather than on the mathemat- 
ical theory underlying those tools. How- 
ever, although the book is a comparatively 
easy read, it does require a commitment 
on the part of the reader. To benefit from 
the book, you need to give it careful study. 

Gunther begins, surprisingly, with a 
chapter on time. There turns out to be a 
lot to say on this subject, including the 
distinction between discrete and continu- 
ous time, types of clocks, time scales, how 
to define and measure response time, and 
various lifetimes, such as time to failure.
Having all this material gathered in a sin- 
gle place is quite useful. The next two 
chapters cover queuing theory. In keep- 
ing with the book’s practical orientation, 
Gunther does not derive the queuing 
equations, but spends all of the time pre- 
senting the results and discussing their ap- 
plicability. For someone with little or no 
previous exposure to queueing theory, 
these two chapters provide an excellent 
introduction, sufficient for the practical 
problems of performance analysis. 
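
To give a flavor of the kind of result those chapters present, here is a minimal sketch of the standard steady-state formulas for the simplest case, a single-server M/M/1 queue. It is written in Python for brevity and is not taken from the book or its CD-ROM; the function name `mm1_metrics` and the sample numbers are assumptions chosen for illustration.

```python
def mm1_metrics(arrival_rate, service_time):
    """Steady-state M/M/1 results for one server.

    arrival_rate -- mean arrivals per second (lambda)
    service_time -- mean service time in seconds (S)
    Returns (utilization, residence_time, queue_length).
    """
    utilization = arrival_rate * service_time           # rho = lambda * S
    if utilization >= 1:
        raise ValueError("unstable queue: utilization must be below 1")
    residence_time = service_time / (1 - utilization)   # R = S / (1 - rho)
    queue_length = arrival_rate * residence_time        # Little's law: N = lambda * R
    return utilization, residence_time, queue_length


# Hypothetical workload: 8 requests per second, 0.1 s of service each.
rho, r, n = mm1_metrics(arrival_rate=8.0, service_time=0.1)
print(f"utilization={rho:.2f} residence={r:.2f}s in_system={n:.1f}")
```

With these numbers the server is 80 percent busy and the residence time comes out near 0.5 seconds, five times the bare service time. That sensitivity to utilization is the sort of practical conclusion the formulas deliver without any need to derive them.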

Perhaps the only significant lack in The 
Practical Performance Analyst is that it 
does not deal with self-similar behavior. 
In recent years, a number of studies have 
shown that network traffic often shows a 
self-similar pattern rather than the Poisson 
(or random) pattern that is typically as- 
sumed in the queueing model. In gener- 
al, performance is worse when self-simi- 
lar behavior is present. However, no 
convenient modeling approach has yet 
evolved for dealing with this phenomenon, 
so Gunther’s omission of this topic is un- 
derstandable. 

The middle section of the book shows 
how to apply queueing analysis to design 
problems. Topics covered include sym- 
metric multiprocessors (SMPs), computer 
clusters, client-server applications, and 
web servers. In each case, Gunther goes 
through the steps needed to perform an 
analysis and provides a number of de- 
tailed examples. This concrete approach 
gives the reader the confidence and the 
tools to handle his or her own specific de- 
sign problems. 

The final, and most difficult, part of the 
book delves into more advanced topics. 
Here the book is concerned with those 
difficult modeling situations in which there 
are unstable configurations or large tran- 
sients to deal with. In some design prob- 
lems, these factors cannot easily be ig- 
nored, so this material is an important 
element of the book. 

Finally, The Practical Performance An- 
alyst includes a CD-ROM with a package 
of portable C routines called the “PDQ” 
(short for “Pretty Darn Quick”) toolset. With 
PDQ, you can quickly set up a perfor- 
mance model for testing design alterna- 
tives. The only requirement is a knowl- 
edge of C. The great advantage of PDQ is 
the speed with which it can be used. In 
today’s commercial environment, there is 
often little time budgeted for performance 
analyses. Indeed, the whole point of The 
Practical Performance Analyst is to make 
it possible to effectively do performance 
analysis while keeping up with the short 
deadlines the analyst typically faces. 


—W.S.
DDJ 
Easily add professional bar
coding capabilities to Windows 
3.1, 95 and NT applications. 


Royalty-Free. 


Create extremely high quality,
device independent, WMF
graphics. Not fonts! Not bitmaps!


www.taltech.com/ddj.htm 


The Virtual Bookstore 
» TE Edit Control (Advanced RTF control)
» HTML Viewer/Editor Add-on for TE
» ReportEase Plus (report writer engine)
» SpellTime DLL and dictionary
» FormPlus (form designer/filler)
» Rich Text Grid control and ChartPro


ARE ROYALTY FREE
AND AVAILABLE FOR
16 OR 32 BITS.


Demos: www.subsystems.com 


SUB SYSTEMS, INC. 
11 Tiger Row, Georgetown, MA 01833 


978-352-9020 Fax: 978-352-9019 
ATL COM
Programmer’s Reference


• The “Cliff Notes” to ATL for
C++ programmers
• Covers COM with extensive references,
descriptions, examples, and notes.
PROGRAMMER TO PROGRAMMER ™ 


Author: Grimes, 1-861002-49-1, $29.99 Visit your local 
bookstore or view a full TOC and sample chapter at 
www.wrox.com 1-800-USE-WROX. 





www.dinkumware.com 
+1 888 4DINKUM 


Dinkum Abridged 
for Windows® CE 


Just the right size library 
for embedded systems. 


Dinkumware, Ltd. 
Genuine Software 





mime++
“a most complete and essential class library”


Licensed by
Fortune 500 companies ...
... and around
the world!


✓ document object model for MIME
✓ C++ library
✓ fully standards compliant (RFC 822, 2045, 2046, & more)
✓ SMTP, POP, NNTP
✓ source code available


Hunny Software @ (301) 948-6999 
www.hunnysoft.com/MIMEPP 





The best cross-platform 
compression libraries 
for Win32, Win16, 
DOS, OS/2, Unix, 
Macintosh, and 
embedded systems. 


Robust 45-function API compresses
buffers, files, archives, disk spanning,
encryption, self-extr. EXE's and more.


FREE DEMO Call 1-800-775-1073
DC Micro Development
Tel (678) 442-1623
Fax (678) 442-1819
www.dcmicro.com


Inner Media 603-465-3216


SERVER-LESS VERSION CONTROL 





ZIP TOOLS 


Easily add royalty-free data compression to 
all of your Windows applications with: 


DynaZIP
Compression Tools


Active Delivery
Self-Extract Zip Tools


ActiveX/DLL/VCL interfaces, full samples & doc’s. 
Most reliable components, millions in use daily. 


Fully supports Active Server Page(ASP) websites. 


New $149 ActiveX version available, great value! 
Download your free eval copy! 


www.innermedia.com 
800-962-2949 (USA) 











Predict Software Speedup 


FREE software accurately predicts the 
code speedup possible from 
parallelization. 


Download at 


http://www.myrias.com/predictor/free 
Call (780) 435-1000 


Myrias Software Corp. 


NEW! 


Code Co-op... Version 2.0 


The versatile Version Control System for 
collaborative development 


* Synchronization using email, local network, floppy disk
* Intuitive GUI -- check-in, check-out, synch, visual diff.
* Fully functional trial version available for download


www.relisoft.com 


Reliable Software


Smart Tools for Smart Programmers


The best compression
controls for Windows
developers.


* Xceed Zip Compression Library v4.0
* Xceed Zip Self-Extractor Module
* Xceed Backup Library


Get the fully functional trial versions at 


www.xceedsoft.com 


1-800-865-2626 
1-450-442-2626 


Xceed Software Inc.
info@xceedsoft.com 


documentation from
comments in your


» New Version 2.1.
» Generates documentation
directly from the
source code.
» Extracts comments.
» User customized
report formats.
» HTML, WinHelp,
RTF.
» FREE working
evaluation at
1-888-646-1933 www.bbeesoft.com


Bumble Bee Software 
P.O. Box 2007 RK 
Westford, MA 01886 


info@bbeesoft.com 


Generate Documentation 
from your source code with DocJet ! 


Produce HTML,
MSHelp, and
MSWord ...
code - and you won’t
need to change your
commenting style!


You can fine-tune your output
with DocJet’s
WYSIWYG
output editor.


FREE TRIAL VERSION! 


http://www.tall-tree.com 
info@tall-tree.com 512-453-4909 


Earn B.S. and M.S. in Computer Science

• NEW B.S. program in Information Systems
• Distance Education
• Object oriented B.S. program
• Approved by more than 275 companies
• Follows ACM/IEEE guidelines
• Thousands of students throughout U.S.


AMERICAN INSTITUTE
STATE LICENSED
ACCREDITED
World Association of Universities and Colleges


Free catalogue 1-800-767-AICS
or www.aics.edu
VICTOR


Image Processing Library 


Fast BMP, TIFF, PCX, GIF, TGA, PNG, JPEG. Adjust 
brightness, contrast, sharpen, create filters, resize, rotate, 
+more of single image, multiple images, or any image area; 
color reduction to optimum, specific, or std. palette; print; 
scan; crop, combine, compare, blend images. 


DOS $199, 16-bit DLL $299, 32-bit DLL $499 


Catenary Systems 
314-962-7833/fax: 314-962-8037 
www.catenary.com/victor 
ask for free demo src avail visa/mc/c.o.d. 


Linux * OS/2 * Solaris * NT * HP-UX * MacOS * AIX


IM Rabel ¥ 


Build professional Java apps
visually on the platform
of your choice.





JAVA and dBase Programmers





Don’t Throw Away Your dBase 
Files 


xBaseJ is a collection of Java classes that reads, writes,
and updates dBase III and IV dbf, dbt, ndx and mdx files.
Only $95.00 and it’s royalty-free. 


www.americancoders.com 


American Coders, Ltd. 
Post Office Box 97462 
Raleigh, NC 27624 
919.846.2014 


PROLOG TOOLS 


Create Prolog 


Components 


Diagnose, advise, configure and plan
with the Amzi!® LogicServer™ tools &
libraries (DLLs) for C/C++, Java, VB,
Delphi, Web Servers & more. Win NT,
95, 3.x & Solaris. Use ODBC, Sockets,
Unicode & new OOP extensions.
SECURITY


InterLok


Copy Protection * Electronic Software Distribution * License Management


Features:
* Wraps Your Software in Minutes
* Try-Before-You-Buy & Immediate Purchase
* Software Metering & Rentals
* Windows & Macintosh Compatibility
* Foreign Language Support
* Key Diskette Option

Sales: (408) 297-7444 ext.


www.paceap.com


© 1998 PACE Anti-Piracy. All rights reserved worldwide. InterLok is a trademark of PACE Anti-Piracy.
C and C++ DOCUMENTATION TOOLS (v. 7.0)


Graphic-tree of caller/called function hierarchy, cross-reference, 
file/function index. 


Creates/inserts/updates comment-blocks (functions/identifier 
used) for each function. 


Calculates path complexity, counts lines with comments, code,
'C' statements.


Lists and action-diagrams, or reformats source into user-selected
standard formats.


Creates cross-reference of local/global/define/parameter identifiers.


All 5 programs integrated as DOS program. <10,000
lines. C-BROWSE Windows graphic-tree viewer.


DOS, Windows, OS/2, 1,000,000+ lines


SOFTWARE BLACKSMITHS INC. email @ swbs.com
6064 St Ives Way, Mississauga Voice/Fax
ONT Canada L5N-4M1 http://www.swbs.com










The Practice of Programming, a new book 
by Brian W. Kernighan and Rob Pike, 
underscores the fundamental point that pro- 
gramming involves more than just writing 
code. As a working programmer, say 
Kernighan and Pike (both members of the 
technical staff at Lucent Technologies’ Bell 
Labs), you must also assess tradeoffs, choose 
among design alternatives, debug and test, 
improve performance, and maintain soft- 
ware written by yourself and others. To this 
end, The Practice of Programming (ISBN
0-201-61586-X) offers practical advice and
real-world examples in C, C++, Java, and a
variety of special-purpose languages. You
can find out more about the book, which
retails for $24.95, by going directly to
http://cseng.awl.com/bookdetail.qry?ISBN=0-201-61586-X&ptype=0.

Addison-Wesley Longman Inc. 

Computer & Engineering Publishing Group 
One Jacob Way 

Reading, MA 01867-3999 

781-944-3700 
http://www.awl.com/cseng/ 


Software Emancipation Technology has 
announced Discover 7.0, tools for soft- 
ware quality process control and man- 
agement. Discover analyzes source code 
and creates a database of information 
that captures the interrelationships be- 
tween the entities in the code base. This 
database is used to better understand the 
code and help manage the software- 
development process. Included in Dis- 
cover 7.0 is support for C/C++, Java, Em- 
bedded SQL, PL/SQL, quality assurance 
tools, tight integration with Microsoft Vi- 
sual Studio and Developer Xpress, and 
a desktop package for rapid source-code 
comprehension. Discover 7.0 with associated
service packages costs $3000.00 and
runs on SunOS, Solaris, HP-UX, IRIX,
and Windows NT.

Software Emancipation Technology Inc. 
15 Third Avenue 

Burlington, MA 01803 

781-359-3300 

http://www.setech.com/ 
InSpeck-3D, from InSpeck, is a computer-aided,
noncontact optical 3D digitizer that measures
the 3D form and texture of a given surface.
Models can then be imported into
3D modeling and animation software. In- 
Speck-3D can be ordered as a color or black 
and white digitizer, and can acquire texture 
and 3D coordinates of up to 300,000 points 
in 0.3 seconds. It uses a halogen white light 
source, and supports Softimage’s Softimage 
3D and Kinetix’s 3D Studio MAX. InSpeck- 
3D costs between $19,500 and $35,000, de- 
pending on the application. 

InSpeck Inc. 

4750 Henri Julyan 

Quebec City, PQ 

Canada H2T 2CA 

514-284-1101 

http://www.inspeck.com/ 


InstallShield has released InstallShield for 
Windows CE. InstallShield lets you create 
installation code for desktop-to-device,
Internet-to-device, and PC-(storage card)- 
to-device. For the desktop-to-device sce- 
nario, InstallShield is tightly integrated with 
InstallShield 5.5 Professional Edition. In- 
stallShield for Windows CE 1.0 costs $495.00 
and is available from InstallShield’s web site. 
InstallShield Software Corp. 

900 National Parkway, Suite 125 
Schaumburg, IL 60173 

847-240-9111 
http://www.installshield.com/ 


Teamtrack 3.0 is the most-recent version 
of Teamshare’s web-based problem track- 
ing system for software-development 
teams. With Teamtrack, you can track and 
prioritize defects, customer requirements, 
change requests, and other issues, all from 
a web browser. New features include fold- 
ers, version-control integration, threaded 
notes, and remote administration. Team- 
track 3.0 sells for $499.00 for a single user 
license, with volume discounts available. 
Teamshare Inc. 

1975 Research Parkway, Suite 105 
Colorado Springs, CO 80920 
719-599-4444 


http://www.teamshare.com/ 


Harlequin has released Harlequin Dylan 
Enterprise Edition, a Windows-hosted de- 
velopment environment based on the Dy- 
lan language. Harlequin Dylan Enterprise 
Edition incorporates CORBA support and 
includes an IIOP-compatible ORB.
Harlequin Dylan Enterprise Edition costs 
$799.00. A Personal Edition is available 
free of charge from Harlequin’s web site. 
Harlequin Inc. 

One Cambridge Center 

Cambridge, MA 02142 

617-374-2400 
http://www.harlequin.com/ 


On Time has announced RTIP, a TCP/IP 
network stack for On Time’s 32- and 16-bit
real-time operating systems. Some of the 
features of RTIP include all of the C source 
code, SLIP/CSLIP and Ethernet drivers, and 
support for the BOOTP, RARP, ARP, ICMP, 
UDP, and TCP protocols. Add-ons for PPP, 
FTP, TFTP, NFS, HTTP, SMTP, POP3, TEL- 
NET, SNMP, and DHCP are also available. 
An RTIP license sells for $7500.00 and is 
royalty free; add-ons cost extra. 

On Time 

88 Christian Avenue 

Setauket, NY 11733 

516-689-6654 


http://www.on-time.com/ 


Pegasus Software has released Smartscan 
Xpress, a 32-bit ActiveX control for imag- 
ing that uses the Active Template Library. 
Smartscan Xpress supports industry formats 
such as Code39, CODABAR, Interleaved 2 
of 5, Code128, UCC128, EAN128, Code93, 
and UPC-A, and automatically detects the 
barcodes in an image. The development kit 
is less than 630 KB in size. 

Pegasus Software 

4522 Spruce Street, Suite 200 

Tampa, FL 33607 

813-875-7575 
http://www.pegasustools.com/ 


Inabyte Software has introduced InaGrid 
1.5, an updated release of Inabyte’s Ac- 
tiveX virtual/unbound grid control. InaGrid 
can support over 999 billion grid rows and 
2 billion columns. InaGrid is 95 KB in size,
and supplies three editing controls that can
be used to capture user input: InaEdit, InaCombo,
and InaCheck. InaEdit is a text tool
that lets users enter data directly into cells; 
InaCombo places drop-down combo box- 
es in grid cells; and InaCheck places check 
boxes into cells. Also, Inabyte provides ac- 
cess to InaEdit’s source code. InaGrid 1.5 
sells for $179.00. 

Inabyte Inc. 

5 Betty Lane 

Novato, CA 94947 

415-883-3407 

http://www.inabyte.com/ 


Gensym has announced NeurOn-Line Stu- 
dio (NOL Studio), a neural-network toolkit 
for facilitating process analysis, modeling, 
and optimization. NOL Studio guides users 
through the process of data preprocessing, 
model configuration, training, validation, 
and deployment. With NOL Studio’s visu- 
alization tools, users can analyze data sets 
with over 100,000 records and 100 variables. 
Applications of NeurOn-Line include infer- 
ential measurements of product quality, 
model-based control, and process fault de- 
tection. NOL Studio can be used off-line, 
for model analysis, or on-line, for real-time 
model-based diagnostics and control. For
online use, there are two options: The pre- 
dictive models and optimization capability 
of NOL Studio can be deployed as COM 
objects in embedded applications. Alterna- 
tively, the NOL Studio models can be load- 
ed into Gensym’s G2 software. Pricing for 
NOL Studio starts at $25,000 per user. 
Gensym Corp. 

125 Cambridge Park Drive 

Cambridge, MA 02140 

617-547-2500 

http://www.gensym.com/ 


Bluestone Software has debuted its XwingML 
(pronounced zwing-M-L) software for con- 
verting XML documents into Java/Swing ap- 
plications. XwingML comes with a standard 
Document Type Definition (DTD) that de- 
fines the entire Swing/Java Foundation Class 
(JFC) set of classes and properties, and pro- 
vides support for all Swing/JFC Listeners. 
XwingML comes complete with sample templates
for a wide variety of GUIs. Users author XML
documents in English; XwingML reads them
and then dynamically creates the
Java GUI. The end result is the ability to
easily create a Java GUI without writing 
any Java code. XwingML is available free 
of charge from Bluestone’s web site. 
Also from Bluestone Software is Blue- 
stone XML-Server, a dynamic XML server 
for distributing and deploying XML appli- 
cations. Bluestone XML-Server dynami- 
cally generates and receives XML docu- 
ments in real time and translates them to 
backend data sources. Features include 
support for virtually all networking pro- 
tocols, introduction of Doclets, compre- 
hensive security features, and core XML 
services. Bluestone XML-Server is priced 
at $2995.00 per CPU. 
Bluestone Software Inc. 
1000 Briggs Road 
Mount Laurel, NJ 08054 
609-727-4600 
http://www.bluestone.com/ 


Popkin Software has released System Ar- 
chitect 2001, an Enterprise Modeling tool 
that integrates business, process, compo- 
nent, object and data modeling techniques 
in a single product. System Architect 2001's
modeling support includes Catalyst and
IDEF for business and process modeling,
UML for component and object modeling,
a new, model-based approach for data
modeling, and complete support for
traditional structured analysis and design
techniques. System Architect 2001 includes
a new 32-bit architecture, DCOM,
and Microsoft’s Visual Basic for Applications
(VBA). System Architect 2001 ships
with a wide selection of generators and
interfaces for application developers. Also
included are links to workflow and simulation
tools, as well as HTML generation for
model publishing over an intranet/extranet.
Popkin Software & Systems 

11 Park Place 

New York, NY 10007 

212-571-3434 

http://www.popkin.com/ 


Acumen Systems has released AcuForm 
SDK 2.0, a software development kit for 
integrating Acumen Systems’ optical
recognition technology into applications.
The SDK covers a full range of image
processing features, which have been enhanced
for improved image display and
recognition accuracy, including Automatic
Deskew, Rotation, Flipping, Inverse, Filtering,
Black and White Border Cropping,
Scale-To-Gray, Form Removal, and
multipage TIFF functions. The SDK is
available in both ActiveX (OCX) control 
and DLL formats. New recognition en- 
gines have been added to this release 
that recognize a wider variety of data 
formats. The SDK now includes ICR 
(handwritten data) and OCR (printed 
text), in addition to OMR (check boxes) 
and barcode recognition. The AcuForm 
SDK costs $1200.00 plus licensing and 
can be purchased directly from Acumen 
Systems. The SDK requires Windows 
95/NT. 

Acumen Systems Inc. 

1481 47 Street 

Brooklyn, NY 11219 

718-438-5100 
http://www.acumensoft.com/ 


Azalea Software has announced that 
http://www.encryption.com/ now hosts 
the Carrick online server. The Carrick on- 
line server encrypts files through your 
browser for free, effectively eliminating 
the current U.S. export restrictions cov- 
ering strong encryption software. The 
Carrick online server allows you to up- 
load a file and a password to the server 
securely using SSL. The file is encrypt- 
ed or decrypted on the server, ready for 
you to download. Because the encryp- 
tion engine itself resides on Azalea’s web 
server in Seattle, nothing is exported. Ex- 
porting encrypted files isn’t covered by 
export restrictions, only encryption en- 
gines. Therefore, Internet users around 
the world can use the Carrick online 
server legally. The Carrick online server 
is hosted on a co-located Windows NT 
box running Active Server Pages calling 
a Carrick DLL. The Carrick online serv- 
er uses Blowfish with a 448-bit key 
length and can be licensed from Azalea 
Software for intranet and Internet servers. 
Carrick is a family of encryption and 
cryptography SDKs that allow software 
developers to incorporate encryption into 
their projects. The various Carrick toolkits
are based on algorithms such as
Blowfish, DES, and SHA-1. 

Azalea Software Inc. 

219 1st Avenue South, Suite 410 

Seattle, WA 98104 

800-362-7978 
http://www.encryption.com/ 


MERANT plc has released PVCS Version 
Manager 6.5, which sports a new web-based 
user interface. New features of PVCS Ver- 
sion Manager 6.5 include a nested, hierar- 
chical project structure, support for parallel 
development and large projects with fea- 
tures such as n-way merge, and links to 
complementary PVCS products such as PVCS 
Dimensions and PVCS Tracker. Pricing for 
Version Manager 6.5 starts at $649.00. 
MERANT PVCS 

735 SW 158th Avenue 

Beaverton, OR 97006 

503-645-1150 


http://www.merant.com/ 


Omni-Vista SP (OVSP) 1.0, project planning 
software from Omni-Vista, provides soft- 
ware development teams with the ability to 
visualize and quantify the impact of chang- 
ing requirements, release dates, prices, bud- 
gets, and schedules of a software project. 
When a parameter within any view is ad- 
justed, all other graphical views are imme- 
diately updated to reflect the change. For 
example, you can add a new project re- 
quirement, and instantly see its impact on 
budgets, schedules, release dates, revenue, 
risk, and profit. OVSP is a Windows appli- 
cation that supports multiple levels of undo, 
customizable default views, and data import 
from Microsoft Project, Microsoft Word, and 
ASCII text files. Omni-Vista SP is licensed
on a per-seat basis at $1295.00 per seat.
Omni-Vista 

4419 Centennial Boulevard, Suite 222 
Colorado Springs, CO 80907 
719-955-6664 


http://www.omni-vista.com/ 


Inner Media is shipping DynaZIP-AX 4.0, a 
Zip-compatible data-compression toolkit/ 
component for Windows developers. This 
toolkit provides multithreaded operations 
and is fully compatible with Active Server 
Pages. It provides a pair of ActiveX com- 
ponents, one each for Zip and Unzip, which 
are fully self contained. DynaZIP-AX costs 
$149.00 per developer, and is royalty free. 
Inner Media Inc. 

60 Plain Road 

Hollis, NH 03049 

603-465-3216 


http://www.innermedia.com/ 
SWAINE’S FLAMES





Moving and Chaos 


Well, we moved. After 10 years in one house, Nancy and I packed up dog and computers
and trucked off to a new home—Stately Swaine Manor II. The move was prompted by 
Nancy’s entrepreneurial plans; as for me, I work in cyberspace: I could live anywhere. 

Don’t get me wrong; I love our new home. My office is bigger, for one thing. And there are other 
things: Six and a half fertile acres, fruit trees, mountain views, river nearby. Plus the entrepreneurial 
opportunity Nancy was looking for, or make that opportunities: an herb garden, organic farm, and 
restaurant. Yes, it’s good to be here. Although getting here was something else: Moving is back- 
breaking labor and back-breaking labor is a pain. But not the biggest pain, I soon found out. 

We had just begun to unpack boxes when cousin Corbett arrived. 

“When does the restaurant open?” he wanted to know. 

“Restaurant? We don’t even have a refrigerator yet.” 

A look of panic came into his eyes, until he saw the pizza on the counter. “So, how did you 
find this place?” he asked around a mouthful of pepperoni pizza. 

“Nancy found it on the Net.” I pulled up a box to use as a chair and got myself a slice. “It just 
shows how pervasive the Internet is these days. She found this place, researched loans, and 
found software to help her run the restaurant—all on the Net. She even took some farm-related
courses via Internet distance learning.” 

“That doesn’t mean the Net is pervasive,” he said, “It means you're living with a nerd.” 

I could have challenged that, but I didn’t want to encourage him. 

“To what do we owe the honor of this visit, Corbett?” 

“There must be something around here to drink,” he said, opening cupboards. 

“You came for something to drink?” I asked, thumbing through the phone book to the 
restaurant section.

“No, no, I’m here to keep you from making a big mistake. I want to make sure you're setting 
up this business right. Tell me what you’ve done so far.” 

“Well, I’ve started planning our network—”

“Your intranet. Is Anchor Steam all you’ve got? No microbrews?” 

“Where did you find those? Give me one. Um, intranet, right. But there are just the two of us 
so far, so I think of it as an ‘intimnet.’ And allocating the hardware. Nancy’s office inherits the 
three generations of obsolete Macs. My office—”

“Yeah, but have you begun evolving an EBC yet?” 

I was only half-listening, concentrating on the pizza, trying to make sure I got a few pieces. 
I’ve never seen anyone like Corbett for eating and talking at the same time, and doing both of 
them very fast, I might add. 

“Huh? EBC?” 

“E-business community. That’s your first step. Your network of suppliers, distributors, 
commerce providers, and customers. You need to get the relationship grid wired up. The day of 
the megacorporation that controls the whole food chain is over. It’s about relationships now.” 

“I think the position of a farm or restaurant in the food chain is pretty much fixed.” 

“That’s Newtonian thinking. You need to adopt the bionomic model. The business of the future 
is chaotic and self-organizing.” 

“Well, we’ve got the first part down.” 

“Mike, you’re stuck in Newtonian economics. The new economy is a steaming tropical rain 
forest, aswarm with fiscal organisms eating and feeding one another in an orgy of ‘coopetition.’”

“Oh, I think the health department would come down on us for that.” 

“And another thing,” he said, taking the last slice, “offices are out. The new corporation is a 
transient virtual company formed over lunch by like-minded people to solve a particular problem, 
then disbanded before dinner.” 

“Like Hollywood movie deals?” 

“Exactly. Business alfresco con latte. So get the restaurant open soon, so we can talk business. 
And now, is my room ready? You do have cable, don’t you?” 